GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 



Run on:         February 7, 2002, 10:57:42 ; Search time 3842.15 Seconds
                                             (without alignments)
                                             1824.838 Million cell updates/sec

Title:          US-09-394-745-6332
Perfect score:  425
Sequence:       1 cggacgcgtgggtgcaattt tgtggtgcctctctcaacct 425

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       1472140 seqs, 8248589755 residues

Total number of hits satisfying chosen parameters:      2944280

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
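
The throughput figure in the header can be cross-checked against the query length, database size and search time. The sketch below assumes that every database residue is compared with every query residue once per strand; that is an inference from the numbers (the hit count of 2944280 is exactly twice the 1472140 sequences searched), not documented GenCore behavior. The function and constant names are illustrative only.

```python
# Values copied from the report header.
QUERY_LENGTH = 425               # query length (equals the "Perfect score")
DB_RESIDUES = 8_248_589_755      # "Searched: ... residues"
DB_SEQUENCES = 1_472_140         # "Searched: ... seqs"
SEARCH_SECONDS = 3842.15         # "Search time"
STRANDS = 2                      # assumption: forward and complement strands both searched


def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=STRANDS):
    """Dynamic-programming cells filled per second, in millions."""
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


rate = million_cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
print(f"{rate:.3f} Million cell updates/sec")   # prints 1824.838 Million cell updates/sec
print(DB_SEQUENCES * STRANDS)                   # prints 2944280
```

Under that two-strand assumption both the reported rate and the total number of hits are reproduced.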



Database :      GenEmbl:*
                1:  gb_ba:*
                2:  gb_htg:*
                3:  gb_in:*
                4:  gb_om:*
                5:  gb_ov:*
                6:  gb_pat:*
                7:  gb_ph:*
                8:  gb_pl:*
                9:  gb_pr:*
                10: gb_ro:*
                11: gb_sts:*
                12: gb_sy:*
                13: gb_un:*
                14: gb_vi:*
                15: em_ba:*
                16: em_fun:*
                17: em_hum:*
                18: em_in:*
                19: em_om:*
                20: em_or:*
                21: em_ov:*
                22: em_pat:*
                23: em_ph:*
                24: em_pl:*
                25: em_ro:*
                26: em_sts:*
                27: em_sy:*
                28: em_un:*
                29: em_vi:*
                30: em_htgo_hum:*
                31: em_htgo_inv:*
                32: em_htgo_rod:*
                33: em_htg_hum:*
                34: em_htg_inv:*
                35: em_htg_rod:*
                36: em_htg_other:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
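
The "sw model" named in the header refers to Smith-Waterman local alignment. The following is a minimal sketch of that scoring scheme with affine gaps, not the GenCore implementation. Assumptions: a match scores +1 and a mismatch -0.6 (values inferred from the scores and match counts reported later in this file, not taken from the IDENTITY_NUC table itself), and a gap of length k costs `gap_open + (k - 1) * gap_extend`. The function name is hypothetical.

```python
def smith_waterman_score(query, target, match=1.0, mismatch=-0.6,
                         gap_open=10.0, gap_extend=1.0):
    """Best local alignment score using the Gotoh affine-gap recurrences."""
    query, target = query.upper(), target.upper()
    neg_inf = float("-inf")
    n = len(target)
    h_prev = [0.0] * (n + 1)       # H for the previous query row
    f_prev = [neg_inf] * (n + 1)   # F: alignments ending in a gap that skips query bases
    best = 0.0
    for qc in query:
        h_cur = [0.0] * (n + 1)
        f_cur = [neg_inf] * (n + 1)
        e = neg_inf                # E: alignments ending in a gap that skips target bases
        for j in range(1, n + 1):
            e = max(e - gap_extend, h_cur[j - 1] - gap_open)
            f_cur[j] = max(f_prev[j] - gap_extend, h_prev[j] - gap_open)
            diag = h_prev[j - 1] + (match if qc == target[j - 1] else mismatch)
            h_cur[j] = max(0.0, diag, e, f_cur[j])   # 0.0 restarts the local alignment
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


print(smith_waterman_score("acgt", "ACGT"))   # prints 4.0
```

With the report's settings (Gapop 10.0, Gapext 1.0) a gap only pays off when it joins two long matching runs, which is consistent with every alignment shown in this file having 0 indels.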

SUMMARIES
                  %
Result          Query
  No.   Score   Match   Length  DB  ID             Description
    1    35.6     8.4   116841   2  AP000643       AP000643 Homo sapi
c   2    35.6     8.4   153084   2  AC079888       AC079888 Oryza sat
c   3    35.2     8.3   114525   2  AF238279       AF238279 Homo sapi
c   4    34.8     8.2   200000   2  AC004618       AC004618 Homo sapi
c   5    34.8     8.2   200000   2  AC004624       AC004624 Homo sapi
    6    34.4     8.1   165909   2  AC079152       AC079152 Homo sapi
c   7    34.2     8.0   110000   2  LMFLCHR34_05   Continuation (6 of
    8    34.2     8.0   207418   2  AC008676       AC008676 Homo sapi
    9    34.2     8.0   340900   1  SME591791      AL591791 Sinorhizo
c  10    33.8     8.0    20343   5  GGVITIIG       X13607 Chicken vit
c  11    33.8     8.0   194575   2  AC023140       AC023140 Homo sapi
   12    33.6     7.9   141307   8  AC084763       AC084763 Oryza sat
   13    33.6     7.9   146921   8  AP002836       AP002836 Oryza sat
c  14    33.6     7.9   162700   2  AC024105       AC024105 Homo sapi
c  15    33.6     7.9   178141   2  AC074345       AC074345 Homo sapi
   16    33.6     7.9   179714   8  AP002743       AP002743 Oryza sat
c  17    33.4     7.9   138902   9  HSA213H19      AL109749 Human DNA
   18    33.2     7.8    11541   1  AE003960       AE003960 Xylella f
c  19    33       7.8   143411  10  AC009361       AC009361 Mus muscu
c  20    33       7.8   200792   2  AC087540       AC087540 Mus muscu
c  21    33       7.8   205884   2  AC068241       AC068241 Mus muscu
   22    32.6     7.7    87417   2  AC016571       AC016571 Homo sapi
   23    32.6     7.7   151183   9  AC004932       AC004932 Homo sapi
c  24    32.4     7.6   4 36.0   6  AX180877       AX180877 Sequence
   25    32.4     7.6     7988  10  MMU05265       U05265 Mus musculu
c  26    32.4     7.6   148849   9  AL158837       AL158837 Human DNA
c  27    32.4     7.6   165245   2  AL451050       AL451050 Homo sapi
   28    32.2     7.6     4182   8  NEUATPA        M84191 N.crassa mi
   29    32.2     7.6    71414   2  AC087154       AC087154 Mus muscu
   30    32.2     7.6   109047   2  HSDJ19F5       AL078592 Homo sapi
c  31    32.2     7.6   112022   9  HSAJ9611       AJ009611 Homo sapi
c  32    32.2     7.6   143970   2  AL360271       AL360271 Homo sapi
   33    32.2     7.6   155526   2  AC013371       AC013371 Homo sapi
c  34    32.2     7.6   157493   2  AC027068       AC027068 Homo sapi
c  35    32.2     7.6   163337   2  AL445704       AL445704 Homo sapi
c  36    32.2     7.6   177540   9  AC006538       AC006538 Homo sapi
c  37    32.2     7.6   186510   9  HS451B15       Z98050 Human DNA s
c  38    32.2     7.6   220455   2  AC091740       AC091740 Homo sapi
   39    32       7.5   100000   9  AP000091       AP000091 Homo sapi
   40    32       7.5   100000   9  AP000195       AP000195 Homo sapi
   41    32       7.5   131375   2  AC090120       AC090120 Oryza sat
   42    32       7.5   146391   8  AC074354       AC074354 Genomic S
   43    32       7.5   147771   2  AC021860       AC021860 Homo sapi
c  44    32       7.5   159121   9  AP000236       AP000236 Homo sapi
c  45    32       7.5   161002   9  AL445664       AL445664 Human DNA



ALIGNMENTS 



RESULT 1
AP000643
LOCUS       AP000643   116841 bp    DNA             HTG       30-MAY-2000
DEFINITION  Homo sapiens chromosome 11 clone CMB9-67M21 map 11q22, WORKING
            DRAFT SEQUENCE, 15 unordered pieces.
ACCESSION   AP000643
VERSION     AP000643.2  GI:8118835
KEYWORDS    HTG; HTGS_PHASE1; HTGS_DRAFT.
SOURCE      Homo sapiens DNA, clone:CMB9-67M21.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 116841)
  AUTHORS   Hattori,M., Ishii,K., Toyoda,A., Taylor,T.D., Hong-Seog,P.,
            Fujiyama,A., Yada,T., Totoki,Y., Watanabe,H. and Sakaki,Y.
  TITLE     Homo sapiens 116,841 genomic DNA of 11q22
  JOURNAL   Published Only in DataBase (1999) In press
REFERENCE   2  (bases 1 to 116841)
  AUTHORS   Hattori,M., Ishii,K., Toyoda,A., Taylor,T.D., Hong-Seog,P.,
            Fujiyama,A., Yada,T., Totoki,Y., Watanabe,H. and Sakaki,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (28-OCT-1999) Masahira Hattori, The Institute of Physical
            and Chemical Research (RIKEN), Genomic Sciences Center (GSC);
            Kitasato Univ., 1-15-1 Kitasato, Sagamihara, Kanagawa 228-8555,
            Japan (E-mail:hattori@gsc.riken.go.jp,
            URL:http://hgp.gsc.riken.go.jp/, Tel:81-42-778-9923,
            Fax:81-42-778-9924)
COMMENT     On May 31, 2000 this sequence version replaced gi:6997520.
            Genome Center
              Center: RIKEN Genomic Sciences Center (GSC)
              Center code: RIKEN
              Web site: http://hgp.gsc.riken.go.jp/
              Contact: hattori@gsc.riken.go.jp
            Project Information
              Center project name: HumDraftll
              Center clone name: CMB9-67M21
            Summary Statistics
              Sequencing vector: PCR products; 100% of reads
              Chemistry: Dye-terminator ET-amersham; 100% of reads
              Assembly program: Phrap; version 0.990329
              Consensus quality: 102653 bases at least Q40
              Consensus quality: 109653 bases at least Q30
              Consensus quality: 113592 bases at least Q20
              Insert size: 115441; sum-of-contigs
              Quality coverage: 4.10x in Q20 bases; sum-of-contigs

            NOTE: This is a 'working draft' sequence. It currently consists of
            15 contigs. The true order of the pieces is not known and their
            order in this sequence record is arbitrary. Gaps between the
            contigs are represented as runs N, but the exact sizes of the gaps
            are unknown. This record will be updated with the finished sequence
            as soon as it is available and the accession number will be
            preserved

                   1   17218  contig of 17218 bp in length
               17319   27299  contig of 9981 bp in length
               27400   37805  contig of 10406 bp in length
               37906   48937  contig of 11032 bp in length
               49038   58240  contig of 9203 bp in length
               58341   68900  contig of 10560 bp in length
               69001   77086  contig of 8086 bp in length
               77187   85247  contig of 8061 bp in length
               85348   93093  contig of 7746 bp in length
               93194   99197  contig of 6004 bp in length
               99298  104114  contig of 4817 bp in length
              104215  109568  contig of 5354 bp in length
              109669  112417  contig of 2749 bp in length
              112518  114990  contig of 2473 bp in length
              115091  116841  contig of 1751 bp in length
            Sequence updated (26-May-2000).

            * NOTE: This is a 'working draft' sequence. It currently
            * consists of 15 contigs. The true order of the pieces
            * is not known and their order in this sequence record is
            * arbitrary. Gaps between the contigs are represented as
            * runs of N, but the exact sizes of the gaps are unknown.
            * This record will be updated with the finished sequence
            * as soon as it is available and the accession number will
            * be preserved.
            *        1  17218: contig of 17218 bp in length
            *    17219  17318: gap of 100 bp
            *    17319  27299: contig of 9981 bp in length
            *    27300  27399: gap of 100 bp
            *    27400  37805: contig of 10406 bp in length
            *    37806  37905: gap of 100 bp
            *    37906  48937: contig of 11032 bp in length
            *    48938  49037: gap of 100 bp
            *    49038  58240: contig of 9203 bp in length
            *    58241  58340: gap of 100 bp
            *    58341  68900: contig of 10560 bp in length
            *    68901  69000: gap of 100 bp
            *    69001  77086: contig of 8086 bp in length
            *    77087  77186: gap of 100 bp
            *    77187  85247: contig of 8061 bp in length
            *    85248  85347: gap of 100 bp
            *    85348  93093: contig of 7746 bp in length
            *    93094  93193: gap of 100 bp
            *    93194  99197: contig of 6004 bp in length
            *    99198  99297: gap of 100 bp
            *    99298 104114: contig of 4817 bp in length
            *   104115 104214: gap of 100 bp
            *   104215 109568: contig of 5354 bp in length
            *   109569 109668: gap of 100 bp
            *   109669 112417: contig of 2749 bp in length
            *   112418 112517: gap of 100 bp
            *   112518 114990: contig of 2473 bp in length
            *   114991 115090: gap of 100 bp
            *   115091 116841: contig of 1751 bp in length
FEATURES             Location/Qualifiers
     source          1..116841
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="11"
                     /map="11q22"
                     /clone="CMB9-67M21"
     misc_feature    1..17218
                     /note="assembly_fragment"
     misc_feature    17319..27299
                     /note="assembly_fragment clone_end:SP6 vector_side:left"
     misc_feature    27400..37805
                     /note="assembly_fragment"
     misc_feature    37906..48937
                     /note="assembly_fragment"
     misc_feature    49038..58240
                     /note="assembly_fragment"
     misc_feature    58341..68900
                     /note="assembly_fragment clone_end:T7 vector_side:right"
     misc_feature    69001..77086
                     /note="assembly_fragment"
     misc_feature    77187..85247
                     /note="assembly_fragment"
     misc_feature    85348..93093
                     /note="assembly_fragment"
     misc_feature    93194..99197
                     /note="assembly_fragment"
     misc_feature    99298..104114
                     /note="assembly_fragment"
     misc_feature    104215..109568
                     /note="assembly_fragment"
     misc_feature    109669..112417
                     /note="assembly_fragment"
     misc_feature    112518..114990
                     /note="assembly_fragment"
     misc_feature    115091..116841
                     /note="assembly_fragment"
BASE COUNT    35694 a  22170 c  21077 g  36500 t   1400 others
ORIGIN
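
The contig coordinates in the COMMENT block of this record can be checked against its summary figures. The sketch below uses the 15 contig ranges listed for AP000643 and confirms that they sum to the stated insert size (115441), that the spacers between them account for the 1400 "others" in BASE COUNT, and that together they tile the 116841 bp record. The function name is illustrative only.

```python
# (start, end) of each contig, copied from the COMMENT block of AP000643.
CONTIGS = [
    (1, 17218), (17319, 27299), (27400, 37805), (37906, 48937),
    (49038, 58240), (58341, 68900), (69001, 77086), (77187, 85247),
    (85348, 93093), (93194, 99197), (99298, 104114), (104215, 109568),
    (109669, 112417), (112518, 114990), (115091, 116841),
]
RECORD_LENGTH = 116841   # from the LOCUS line


def summarize_layout(contigs, record_length):
    """Return (total contig bases, list of gap sizes) for 1-based inclusive ranges."""
    contig_bases = sum(end - start + 1 for start, end in contigs)
    gaps = [next_start - prev_end - 1
            for (_, prev_end), (next_start, _) in zip(contigs, contigs[1:])]
    if contig_bases + sum(gaps) != record_length:
        raise ValueError("contigs and gaps do not tile the record")
    return contig_bases, gaps


contig_bases, gaps = summarize_layout(CONTIGS, RECORD_LENGTH)
print(contig_bases)          # prints 115441 (the "Insert size ... sum-of-contigs" value)
print(len(gaps), sum(gaps))  # prints 14 1400 (the "others" count in BASE COUNT)
```

Every spacer in this record is exactly 100 bases, which matches the "gap of 100 bp" rows of the second listing.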



  Query Match             8.4%;  Score 35.6;  DB 2;  Length 116841;
  Best Local Similarity   60.2%;  Pred. No. 5.3;
  Matches 59;  Conservative 0;  Mismatches 39;  Indels 0;  Gaps 0;

Qy         59 tgtgctctacttctgcctgatggcccttgtcgtagctgctatggtctgtgtcatgtacac 118
              ||||||||||   |||||| |  || ||  | |  |||||||| | ||   ||   | |
Db      27788 TGTGCTCTACAGATGCCTGCTTCCCTTTCACTTCTCTGCTATGTTATGACACAGCAAGAG 27847

Qy        119 cacctcggcacaagcaggaaggagtggctacaactcgt 156
                ||||  ||  |||      || |||| || | || |
Db      27848 GCCCTCATCAGGAGCTAACCAGATTGGCCACCAGTCTT 27885
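
The three statistics printed with each alignment are related by simple arithmetic. A minimal sketch for the ungapped alignments shown in this report follows. The match score (+1) and mismatch score (-0.6) are assumptions inferred from the reported figures, since they reproduce the scores of all four alignments in this section; they are not read from the IDENTITY_NUC table. Gap penalties are left out because every alignment here has 0 indels. The function name is hypothetical.

```python
PERFECT_SCORE = 425      # from the report header
MATCH_SCORE = 1.0        # assumption
MISMATCH_SCORE = -0.6    # assumption inferred from the reported scores


def alignment_stats(matches, mismatches):
    """Score, Query Match % and Best Local Similarity % for an ungapped alignment."""
    score = matches * MATCH_SCORE + mismatches * MISMATCH_SCORE
    return {
        "score": round(score, 1),
        "query_match_pct": round(100 * score / PERFECT_SCORE, 1),
        "best_local_similarity_pct": round(100 * matches / (matches + mismatches), 1),
    }


print(alignment_stats(59, 39))
# prints {'score': 35.6, 'query_match_pct': 8.4, 'best_local_similarity_pct': 60.2}
```

Query Match is the score as a percentage of the perfect score, and Best Local Similarity is the fraction of aligned columns that match.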



RESULT 2
AC079888/C
LOCUS       AC079888   153084 bp    DNA             HTG       16-MAR-2001
DEFINITION  Oryza sativa chromosome 10 clone OSJNBa0078O01, *** SEQUENCING IN
            PROGRESS ***, 4 unordered pieces.
ACCESSION   AC079888
VERSION     AC079888.7  GI:13357270
KEYWORDS    HTG; HTGS_PHASE1.
SOURCE      Oryza sativa.
  ORGANISM  Oryza sativa
            Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
            Magnoliophyta; Liliopsida; Poales; Poaceae; Ehrhartoideae; Oryzeae;
            Oryza.
REFERENCE   1  (bases 1 to 153084)
  AUTHORS   Buell,R., Hsiao,J., Zismann,V., Moffat,K.M., Hill,J.,
            Gansberger,K., Burgess,S., Jarrahi,B., Shvartsbeyn,M., Brenner,M.,
            Ciecko,A., Pai,G., Vanaken,S., Hansen,C., Utterbach,T.,
            Feldblyum,T., Khalak,H.G., Yuan,Q., Quackenbush,J., White,O.,
            Salzberg,S. and Fraser,C.
  TITLE     Oryza sativa ssp. japonica cv. Nipponbare OSJNBa0078O01 BAC genomic
            sequence
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 153084)
  AUTHORS   Buell,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-SEP-2000) The Institute for Genomic Research, 9712
            Medical Center Dr, Rockville, MD 20850, USA
COMMENT     On Mar 16, 2001 this sequence version replaced gi:12039424.
            * NOTE: This is a 'working draft' sequence. It currently
            * consists of 4 contigs. The true order of the pieces
            * is not known and their order in this sequence record is
            * arbitrary. Gaps between the contigs are represented as
            * runs of N, but the exact sizes of the gaps are unknown.
            * This record will be updated with the finished sequence
            * as soon as it is available and the accession number will
            * be preserved.
            *        1  58321: contig of 58321 bp in length
            *    58322  58355: gap of unknown length
            *    58356  60656: contig of 2301 bp in length
            *    60657  60689: gap of unknown length
            *    60690 128049: contig of 67360 bp in length
            *   128050 128082: gap of unknown length
            *   128083 153084: contig of 25002 bp in length.
FEATURES             Location/Qualifiers
     source          1..153084
                     /organism="Oryza sativa"
                     /cultivar="Nipponbare"
                     /sub_species="japonica"
                     /db_xref="taxon:4530"
                     /chromosome="10"
                     /clone="OS38-OSJNBa0078O01"
                     /clone="OSJNBa0078O01"
BASE COUNT    44125 a  33372 c  33060 g  42426 t    101 others
ORIGIN



  Query Match             8.4%;  Score 35.6;  DB 2;  Length 153084;
  Best Local Similarity   47.7%;  Pred. No. 5.2;
  Matches 104;  Conservative 0;  Mismatches 114;  Indels 0;  Gaps 0;

Qy         23 ggagagagacgagatcatgaggaagcaatactcccctgtgctctacttctgcctgatggc 82
              || | | | |     || |  | | |   | || || |  |||  | || ||||  | |
Db      83983 GGCGTGGGTCATCCGCAAGGTGCACCTCGAGTCGCCCGACCTCGCCGTCGGCCTCCTCGG 83924

Qy         83 ccttgtcgtagctgctatggtctgtgtcatgtacaccacctcggcacaagcaggaaggag 142
              ||| ||||   |     | | |   |||||| |  |      ||  |       | | |
Db      83923 CCTCGTCGCGTCCTGCCTCGGCACGGTCATGGAGGCGGAGATGGACCGGATCAAACGCAA 83864

Qy        143 tggctacaactcgtacgaacctgatggaaggggtggatacaactctgttcccatcaacgg 202
                 |  | |  ||| ||   | |  |    ||  |    |||| |||  |||  |||||
Db      83863 GAACGTCGAGCCGTCCGCGTCGGTGGCGGCGGCGGCCAGCAACGCTGCCCCCGACAACGA 83804

Qy        203 cggtggcagcccctagctaggcggtggatccgagcctg 240
              ||| |||  | || | |    ||  |   ||||  | |
Db      83803 CGGCGGCGACACCGACCAGATCGAGGACGCCGACGCCG 83766
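
Results whose identifier carries "/C" (and that are flagged "c" in the summary table) are hits on the complementary strand, which is why the Db coordinates of this alignment count downward. A minimal sketch follows, assuming the displayed Db line is the reverse complement of the forward-strand database record; that is the usual convention but it is not stated in the report. The helper names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a nucleotide string, leaving N unchanged."""
    return seq.translate(_COMPLEMENT)[::-1]


def forward_strand_span(db_start, db_end):
    """Return (low, high) forward-strand coordinates for a displayed Db row."""
    return min(db_start, db_end), max(db_start, db_end)


displayed = "CGGCGGCGACACCGACCAGATCGAGGACGCCGACGCCG"   # last Db row of RESULT 2
low, high = forward_strand_span(83803, 83766)
print(low, high, high - low + 1)       # prints 83766 83803 38
print(reverse_complement(displayed))   # the bases as stored on the forward strand
```

The span length (38) equals the number of bases displayed on that row, so descending coordinates cover the same region as ascending ones.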



RESULT 3
AF238279/C
LOCUS       AF238279   114525 bp    DNA             HTG       08-JUN-2001
DEFINITION  Homo sapiens chromosome 8 clone RP5-1127D12 map 8p, WORKING DRAFT
            SEQUENCE, 26 unordered pieces.
ACCESSION   AF238279
VERSION     AF238279.3  GI:14329033
KEYWORDS    HTG; HTGS_PHASE1; HTGS_DRAFT.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 114525)
  AUTHORS   Schilhabel,M.B., Baumgart,C., Blechschmidt,K., Dette,M., Jahn,N.,
            Lehmann,R., Menzel,U., Polley,A., Reichwald,K., Schudy,A.,
            Siddiqui,R., Taudien,S., Wen,G., Siebert,R., Schlegelberger,B.,
            Rosenthal,A. and Platzer,M.
  TITLE     Chromosome 8 genomic sequence
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 114525)
  AUTHORS   Genome Sequencing Center Jena.
  TITLE     Direct Submission
  JOURNAL   Submitted (24-FEB-2000) Genome Analysis, Institute of Molecular
            Biotechnology, Beutenberstr. 11, Jena 07745, Germany
COMMENT     On Jun 8, 2001 this sequence version replaced gi:8151654.
            Genome Center
              Center: Insitute of Molecular Biotechnoloy
              Center code: IMB
              Web site: http://genome.imb-jena.de/
              Contact: gscj-submit@genome.imb-jena.de
            Project Information
              Center project name: H405
              Center clone name: RP5-1127D12
            Summary Statistics
              Sequencing vector: M13; 100% of reads
              Chemistry: Dye-terminator Big Dye; 100% of reads
              Assembly program: Phrap; version 0.990329
              Consensus quality: 88803 bases at least Q40
              Consensus quality: 97704 bases at least Q30
              Consensus quality: 104292 bases at least Q20
              Quality coverage: 3.05 x in Q20 bases; sum-of-contigs

            Sequence Quality Assessment:
            This entry has been annotated with sequence quality
            estimates computed by the Phrap assembly program.
            All manually edited bases have been reduced to quality 10.
            Quality levels above 40 are expected to have less than
            1 error in 10,000 bp.
            Base-by-base quality values are not generally visible from the
            GenBank flat file format but are available as part
            of this entry's ASN.1 file.

            * NOTE: This is a 'working draft' sequence. It currently
            * consists of 26 contigs. The true order of the pieces
            * is not known and their order in this sequence record is
            * arbitrary. Gaps between the contigs are represented as
            * runs of N, but the exact sizes of the gaps are unknown.
            * This record will be updated with the finished sequence
            * as soon as it is available and the accession number will
            * be preserved.
            *        1   1583: contig of 1583 bp in length
            *     1584   1683: gap of unknown length
            *     1684   2733: contig of 1050 bp in length
            *     2734   2833: gap of unknown length
            *     2834   4191: contig of 1358 bp in length
            *     4192   4291: gap of unknown length
            *     4292   5780: contig of 1489 bp in length
            *     5781   5880: gap of unknown length
            *     5881   6740: contig of 860 bp in length
            *     6741   6840: gap of unknown length
            *     6841   7922: contig of 1082 bp in length
            *     7923   8022: gap of unknown length
            *     8023   9862: contig of 1840 bp in length
            *     9863   9962: gap of unknown length
            *     9963  11563: contig of 1601 bp in length
            *    11564  11663: gap of unknown length
            *    11664  12833: contig of 1170 bp in length
            *    12834  12933: gap of unknown length
            *    12934  14109: contig of 1176 bp in length
            *    14110  14209: gap of unknown length
            *    14210  20044: contig of 5835 bp in length
            *    20045  20144: gap of unknown length
            *    20145  23232: contig of 3088 bp in length
            *    23233  23332: gap of unknown length
            *    23333  27722: contig of 4390 bp in length
            *    27723  27822: gap of unknown length
            *    27823  31485: contig of 3663 bp in length
            *    31486  31585: gap of unknown length
            *    31586  35150: contig of 3565 bp in length
            *    35151  35250: gap of unknown length
            *    35251  40074: contig of 4824 bp in length
            *    40075  40174: gap of unknown length
            *    40175  44444: contig of 4270 bp in length
            *    44445  44544: gap of unknown length
            *    44545  51427: contig of 6883 bp in length
            *    51428  51527: gap of unknown length
            *    51528  57079: contig of 5552 bp in length
            *    57080  57179: gap of unknown length
            *    57180  64710: contig of 7531 bp in length
            *    64711  64810: gap of unknown length
            *    64811  71527: contig of 6717 bp in length
            *    71528  71627: gap of unknown length
            *    71628  79632: contig of 8005 bp in length
            *    79633  79732: gap of unknown length
            *    79733  88288: contig of 8556 bp in length
            *    88289  88388: gap of unknown length
            *    88389  97053: contig of 8665 bp in length
            *    97054  97153: gap of unknown length
            *    97154 106653: contig of 9500 bp in length
            *   106654 106753: gap of unknown length
            *   106754 114525: contig of 7772 bp in length.
FEATURES             Location/Qualifiers
     source          1..114525
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="8"
                     /map="8p"
                     /clone="RP5-1127D12"
BASE COUNT    32955 a  23434 c  23679 g  31957 t   2500 others
ORIGIN



  Query Match             8.3%;  Score 35.2;  DB 2;  Length 114525;
  Best Local Similarity   50.6%;  Pred. No. 7;
  Matches 85;  Conservative 0;  Mismatches 83;  Indels 0;  Gaps 0;

Qy        100 tggtctgtgtcatgtacaccacctcggcacaagcaggaaggagtggctacaactcgtacg 159
              ||||| |   | | |  | ||     | |  | |||  ||| ||  ||||||    ||
Db      24677 TGGTCAGACACCTCTGAAACATGGGTGAATTATCAGAGAGGCGTCCCTACAATGATTAAA 24618

Qy        160 aacctgatggaaggggtggatacaactctgttcccatcaacggcggtggcagcccctagc 219
               |||    | | |  |      | | || ||  |   || ||| | |    | ||   |
Db      24617 CACCAAGGGAAGGCTGCCTTCCCTAGTCCGTGACTGGCACCGGAGTTTTGGGTCCACGGA 24558

Qy        220 taggcggtggatccgagcctgtatcagaaatcgaaataatataagact 267
              ||    |||  |||  | || || ||||||  |||| ||  | | | |
Db      24557 TAAAACGTGTCTCCTTGTCTCTACCAGAAAATGAAAGAAATTGAAATT 24510



RESULT 4
AC004618/C
LOCUS       AC004618   200000 bp    DNA             HTG       04-DEC-1998
DEFINITION  Homo sapiens chromosome 4, *** SEQUENCING IN PROGRESS ***, 24
            unordered pieces.
ACCESSION   AC004618
VERSION     AC004618.1  GI:3962501
KEYWORDS    HTG; HTGS_PHASE1.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 200000)
  AUTHORS   Stone,N.E., Schmutz,J.J., Cox,D.R. and Myers,R.M.
  TITLE     Direct Submission
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 200000)
  AUTHORS   Stone,N.E., Schmutz,J.J., Cox,D.R. and Myers,R.M.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-APR-1998) Department of Genetics, Stanford Human
            Genome Center, 855 California Avenue, Palo Alto, CA 94304, USA
COMMENT     On Dec 4, 1998 this sequence version replaced gi:3927817.
            * NOTE: This is a 'working draft' sequence. It currently
            * consists of 24 contigs. The true order of the pieces
            * is not known and their order in this sequence record is
            * arbitrary. Gaps between the contigs are represented as
            * runs of N, but the exact sizes of the gaps are unknown.
            * This record will be updated with the finished sequence
            * as soon as it is available and the accession number will
            * be preserved.
            *            1111: contig of 1111 bp in length
            *     1112   2911: gap of unknown length
            *     2912   4900: contig of 1989 bp in length
            *     4901   6701: gap of unknown length
            *     6702   8088: contig of 1387 bp in length
            *     8089   9889: gap of unknown length
            *     9890  11145: contig of 1256 bp in length
            *    11146  12946: gap of unknown length
            *    12947  14128: contig of 1182 bp in length
            *    14129  15929: gap of unknown length
            *           18417: contig of 2488 bp in length
            *    18418  20217: gap of unknown length
            *    20218  22278: contig of 2061 bp in length
            *    22279  24078: gap of unknown length
            *    24079  26112: contig of 2034 bp in length
            *    26113  27912: gap of unknown length
            *    27913  31852: contig of 3940 bp in length
            *    31853  33652: gap of unknown length
            *    33653  37450: contig of 3798 bp in length
            *    37451  39250: gap of unknown length
            *    39251  43441: contig of 4191 bp in length
            *    43442  45241: gap of unknown length
            *    45242  51701: contig of 6460 bp in length
            *    51702  53501: gap of unknown length
            *    53502  58600: contig of 5099 bp in length
            *    58601  60400: gap of unknown length
            *    60401  64575: contig of 4175 bp in length
            *    64576  66375: gap of unknown length
            *    66376  72514: contig of 6139 bp in length
            *    72515  74314: gap of unknown length
            *    74315  82110: contig of 7796 bp in length
            *    82111  83910: gap of unknown length
            *    83911  90792: contig of 6882 bp in length
            *    90793  92592: gap of unknown length
            *    92593  99418: contig of 6826 bp in length
            *    99419 101218: gap of unknown length
            *   101219 109648: contig of 8430 bp in length
            *   109649 111448: gap of unknown length
            *   111449 121402: contig of 9954 bp in length
            *   121403 123202: gap of unknown length
            *   123203 137696: contig of 14494 bp in length
            *   137697 139496: gap of unknown length
            *   139497 161238: contig of 21742 bp in length
            *   161239 163038: gap of unknown length
            *   163039 178291: contig of 15253 bp in length
            *   178292 180091: gap of unknown length
            *   180092 200000: contig of 19909 bp in length.
FEATURES             Location/Qualifiers
     source          1..200000
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="4"
BASE COUNT    38926 a  40001 c  39609 g  40039 t  41425 others
ORIGIN



  Query Match             8.2%;  Score 34.8;  DB 2;  Length 200000;
  Best Local Similarity   46.6%;  Pred. No. 8.9;
  Matches 111;  Conservative 0;  Mismatches 127;  Indels 0;  Gaps 0;

Qy         61 tgctctacttctgcctgatggcccttgtcgtagctgctatggtctgtgtcatgtacacca 120
              |||||| | |    ||| ||   |||  | |  ||  |       | |||||| |  ||
Db      66764 TGCTCTGCATGATGCTGCTGCTTCTTTCCCTTCCTCGTCAATCAGGAGTCATGAAGCCCT 66705

Qy        121 cctcggcacaagcaggaaggagtggctacaactcgtacgaacctgatggaaggggtggat 180
              |  |  | || |     || ||    |  |||||  ||| | |          ||  ||
Db      66704 CAGCTCCCCATGTGCTCAGTAGCACATGAAACTCACACGCAACATCGAACCTTGGAAGAC 66645

Qy        181 acaactctgttcccatcaacggcggtggcagcccctagctaggcggtggatccgagcctg 240
              |      | |||  | | ||  | |   ||||  ||||       |       |||||
Db      66644 ATTGAGGTTTTCAGACCTACAACTGGTACAGCTTCTAGGCCCAATGCTCCAATGAGCCGC 66585

Qy        241 tatcagaaatcgaaataatataagactgtcttcaacggatcacactgccgctccccca 298
               |||| |||  | |  || | |  |     |||| |||||  | |  |    ||  ||
Db      66584 CATCACAAAAGGTAGCAAAAAAGCAAACCATTCAGCGGATGCCTCCCCGTAACCAGCA 66527



RESULT 5
AC004624/C
LOCUS       AC004624   200000 bp    DNA             HTG       03-SEP-1999
DEFINITION  Homo sapiens chromosome 4, *** SEQUENCING IN PROGRESS ***, 2
            unordered pieces.
ACCESSION   AC004624
VERSION     AC004624.6  GI:5706769
KEYWORDS    HTG; HTGS_PHASE1.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 200000)
  AUTHORS   Stone,N.E., Schmutz,J.J., Cox,D.R. and Myers,R.M.
  TITLE     Direct Submission
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 200000)
  AUTHORS   Stone,N.E., Schmutz,J.J., Cox,D.R. and Myers,R.M.
  TITLE     Direct Submission
  JOURNAL   Submitted (28-APR-1998) Department of Genetics, Stanford Human
            Genome Center, 855 California Avenue, Palo Alto, CA 94304, USA
COMMENT     On Aug 6, 1999 this sequence version replaced gi:5705982.
            * NOTE: This is a 'working draft' sequence. It currently
            * consists of 2 contigs. The true order of the pieces
            * is not known and their order in this sequence record is
            * arbitrary. Gaps between the contigs are represented as
            * runs of N, but the exact sizes of the gaps are unknown.
            * This record will be updated with the finished sequence
            * as soon as it is available and the accession number will
            * be preserved.
            *        1  71732: contig of 71732 bp in length
            *    71733  96430: gap of unknown length
            *    96431 200000: contig of 103570 bp in length.
FEATURES             Location/Qualifiers
     source          1..200000
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="4"
BASE COUNT    44599 a  42806 c  44180 g  43717 t  24698 others
ORIGIN



  Query Match             8.2%;  Score 34.8;  DB 2;  Length 200000;
  Best Local Similarity   46.6%;  Pred. No. 8.9;
  Matches 111;  Conservative 0;  Mismatches 127;  Indels 0;  Gaps 0;

Qy         61 tgctctacttctgcctgatggcccttgtcgtagctgctatggtctgtgtcatgtacacca 120
              |||||| | |    ||| ||   |||  | |  ||  |       | |||||| |  ||
Db      44261 TGCTCTGCATGATGCTGCTGCTTCTTTCCCTTCCTCGTCAATCAGGAGTCATGAAGCCCT 44202

Qy        121 cctcggcacaagcaggaaggagtggctacaactcgtacgaacctgatggaaggggtggat 180
              |  |  | || |     || ||    |  |||||  ||| | |          ||  ||
Db      44201 CAGCTCCCCATGTGCTCAGTAGCACATGAAACTCACACGCAACATCGAACCTTGGAAGAC 44142

Qy        181 acaactctgttcccatcaacggcggtggcagcccctagctaggcggtggatccgagcctg 240
              |      | |||  | | ||  | |   ||||  ||||       |       |||||
Db      44141 ATTGAGGTTTTCAGACCTACAACTGGTACAGCTTCTAGGCCCAATGCTCCAATGAGCCGC 44082

Qy        241 tatcagaaatcgaaataatataagactgtcttcaacggatcacactgccgctccccca 298
               |||| |||  | |  || | |  |     |||| |||||  | |  |    ||  ||
Db      44081 CATCACAAAAGGTAGCAAAAAAGCAAACCATTCAGCGGATGCCTCCCCGTAACCAGCA 44024



RESULT 6
AC079152
LOCUS       AC079152   165909 bp    DNA             HTG       20-AUG-2000
DEFINITION  Homo sapiens chromosome UNK clone RP11-181J6, *** SEQUENCING IN
            PROGRESS ***, 33 unordered pieces.
ACCESSION   AC079152
VERSION     AC079152.1  GI:9858435
KEYWORDS    HTG; HTGS_PHASE1.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 165909)
  AUTHORS   Waterston,R.H.
  TITLE     The sequence of Homo sapiens clone
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 165909)
  AUTHORS   Waterston,R.H.
  TITLE     Direct Submission
  JOURNAL   Submitted (20-AUG-2000) Genome Sequencing Center, Washington
            University School of Medicine, 4444 Forest Park Parkway, St. Louis,
            MO 63108, USA
COMMENT     Genome Center
              Center: Washington University Genome Sequencing Center
              Web site: http://genome.wustl.edu/gsc/index.shtml
            Project Information
            * NOTE: This is a 'working draft' sequence. It currently
            * consists of 33 contigs. The true order of the pieces
            * is not known and their order in this sequence record is
            * arbitrary. Gaps between the contigs are represented as
            * runs of N, but the exact sizes of the gaps are unknown.
            * This record will be updated with the finished sequence
            * as soon as it is available and the accession number will
            * be preserved.

            *        1       : contig of 1350 bp in length
            *     1351   1450: gap of unknown length
            *     1451   3921: contig of 2471 bp in length
            *     3922   4021: gap of unknown length
            *     4022       : contig of 2305 bp in length
            *     6327       : gap of unknown length
            *     6427   9396: contig of 2970 bp in length
            *     9397       : gap of unknown length
            *                : contig of 2830 bp in length
            *                : gap of unknown length
            *    12427  14557: contig of 2131 bp in length
            *    14558       : gap of unknown length
            *    14658       : contig of 2700 bp in length
            *                : gap of unknown length
            *    17458       : contig of 2256 bp in length
            *    19714       : gap of unknown length
            *           23293: contig of 3480 bp in length
            *                : gap of unknown length
            *                : contig of 3657 bp in length
            *                : gap of unknown length
            *    27151       : contig of 3835 bp in length
            *    30986       : gap of unknown length
            *                : contig of      bp in length
            *                : gap of unknown length
            *    35124       : contig of 3265 bp in length
            *                : gap of unknown length
            *                : contig of      bp in length
            *                : gap of unknown length
            *                : contig of 4166 bp in length
            *                : gap of unknown length
            *                : contig of 3920 bp in length
            *                : gap of unknown length
            *    50435  57053: contig of 6619 bp in length



FEATURES 

source 
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: contig 
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bp in length 


•k 
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: gap of 
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length 
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: contig 
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bp in length 
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gap of 


unknown 
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contig 
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bp in length 
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1 U 4 /DO 


gap of 


unknown 


length 


*k 
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contig 


or bool 


bp in length 


~k 
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gap of 


unknown 


1 — _ .— 4- 1— 

length 


•k 
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contig 


of 5624 


bp in length 




*i i c o i n 

1 1 do 1 y 


1 1 bylo 


gap of 


unknown 


length 


k 


1 1 d y i y 


1 O C O H 1 

1 Z bo / I 


contig 


of 9953 


bp in length 


•k 


1 O C O *7 O 
1 Z DO / Z 


i o c m i 

iz by /i 


gap of 


unknown 


length 


-k 


i z b y / Z 


1 o b y U U 


contig 


of 9929 


bp in length 


-k 


136901 


137000 


gap of 


unknown 


length 




137001 


147497 


contig 


of 10497 bp in length 


•k 


147498 


147597 


gap of 


unknown 


length 


k 


147598 


163433 


contig 


of 15836 bp in length 


k 


163434 


163533 


gap of 


unknown 


length 


k 


163534 


165909 


contig 


of 2376 


bp in length. 




Location/Qualifiers 






1 . 


.165909 









misc_f eature 
misc_f eature 
misc_f eature 
misc_f eature 
misc_f eature 
misc_f eature 
misc_f eature 
misc_f eature 
misc feature 



/organism^"Homo sapiens" 
/db_xref="taxon: 9606" 
/chromosome="UNK" 
/clone= n RPll-181J6" 
1. .1350 

/note="assembly_name :ContiglO" 
1451. .3921 

/note="assembly_name rContigll" 
4022. .6326 

/note="assembly_name :Contigl2" 
6427. .9396 

/note="assembly_name : Contigl3" 
9497. .12326 

/note="assembly_name : Contigl4 " 
12427. .14557 

/note="assembly_name : ContiglS" 
14658. .17357 

/note="assembly_name : Contigl6" 
17458. .19713 

/note="assembly_name : Contigl7" 
19814. .23293 

/note="assembly_name : Contigl8 
clone end:SP6 



vector_side : lef t " 
misc_feature 23394..27050
/note="assembly_name:Contig19"
misc_feature 27151..30985
/note="assembly_name:Contig20"
misc_feature 31086..35023
/note="assembly_name:Contig21"
misc_feature 35124..38388
/note="assembly_name:Contig22"
misc_feature 38489..42048
/note="assembly_name:Contig23"
misc_feature 42149..46314
/note="assembly_name:Contig24"
misc_feature 46415..50334
/note="assembly_name:Contig25"
misc_feature 50435..57053
/note="assembly_name:Contig26"
misc_feature 57154..61080
/note="assembly_name:Contig27"
misc_feature 61181..65562
/note="assembly_name:Contig28"
misc_feature 65663..69259
/note="assembly_name:Contig29"
misc_feature 69360..73828
/note="assembly_name:Contig30"
misc_feature 73929..79818
/note="assembly_name:Contig31"
misc_feature 79919..84908
/note="assembly_name:Contig32"
misc_feature 85009..90835
/note="assembly_name:Contig33"
misc_feature 90936..97723
/note="assembly_name:Contig34"
misc_feature 97824..104663
/note="assembly_name:Contig35"
misc_feature 104764..111094
/note="assembly_name:Contig36"
misc_feature 111195..116818
/note="assembly_name:Contig37"
misc_feature 116919..126871
/note="assembly_name:Contig38"
misc_feature 126972..136900
/note="assembly_name:Contig39"
misc_feature 137001..147497
/note="assembly_name:Contig40"
misc_feature 147598..163433
/note="assembly_name:Contig41"
misc_feature 163534..165909
/note="assembly_name:Contig9"
BASE COUNT 49077 a 31362 c 32471 g 49718 t 3281 others 
ORIGIN 



Query Match 8.1%; Score 34.4; DB 2; Length 165909; 

Best Local Similarity 32.2%; Pred. No. 12; 

Matches 98; Conservative 0; Mismatches 206; Indels 0; Gaps 0;



Qy 98 tatggtctgtgtcatgtacaccacctcggcacaagcaggaaggagtggctacaactcgta 157 

II III I I I I -I I I I I I I I I I I I I II I III 

Db 46157 TACCGGCGGGCCCAGGGAAAGCTGACACGGGCAGGCGGGAAGGGAGCACTCGACGGCGGA 46216 

Qy 158 cgaacctgatggaaggggtggatacaactctgttcccatcaacggcggtggcagccccta 217 

II I I I I I I I I I I I I I II I I I I I I I I 
Db 46217 CCGATGGGAAAGAAGACGCGCTTGCACCGCCAGACGCGCGACACGAAGGGAGAAGCACTC 46276

Qy 218 gctaggcggtggatccgagcctgtatcagaaatcgaaataatataagactgtcttcaacg 277 

II MM' I II I I I I I I I I I I I 
Db 46277 ACTTTTCGATGTTGACGAGGCTCTCCTGGCAGTGGAAANNNNNNNNNNNNNNNNNNNNNN 46336

Qy 278 gatcacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgatt 337 

Db 46337 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 46396

Qy 338 aacggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccc 397 

III II I M M I I I I I I I I I II M 

Db 46397 NNNNNNNNNNNNNNNNNNCCCCTTCTCCCTTCCCCCCACCCTCCCCACCTCCTCCCTCCC 46456

Qy 398 cctc 401 
I I 

Db 46457 TCCC 46460 
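
The run of N in the Db rows above can be cross-checked against the contig coordinates in this record's feature table. The following is a minimal Python sketch and is not part of the GenCore output. The helper name `overlap` is hypothetical. It assumes the gap is the span between the misc_feature ranges 42149..46314 and 46415..50334, and that the alignment covers Db positions 46157 to 46460 as printed.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Return the inclusive overlap of two 1-based ranges, or None if disjoint."""
    start = max(a_start, b_start)
    end = min(a_end, b_end)
    return (start, end) if start <= end else None

# Contig ranges copied from the misc_feature lines of this record.
contig_before = (42149, 46314)
contig_after = (46415, 50334)

# The gap is whatever lies strictly between the two contigs.
gap = (contig_before[1] + 1, contig_after[0] - 1)
gap_length = gap[1] - gap[0] + 1

# Db span of the alignment as printed in the Db rows.
alignment = (46157, 46460)
gap_in_alignment = overlap(*alignment, *gap)

print("gap:", gap, "length:", gap_length)
print("part of alignment inside gap:", gap_in_alignment)
```

Under these assumptions the whole gap lies inside the aligned region, which would explain why a large share of the Db columns are N and count as mismatches.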



RESULT 7 

LMFLCHR34_05/c 

WPCOMMENT 

Sequence split into 18 fragments LOCUS LMFLCHR34 Accession AL499623 



Fragment Name   Begin     End
LMFLCHR34_00    1         110000
LMFLCHR34_01    100001    210000
LMFLCHR34_02    200001    310000
LMFLCHR34_03    300001    410000
LMFLCHR34_04    400001    510000
LMFLCHR34_05    500001    610000
LMFLCHR34_06    600001    710000
LMFLCHR34_07    700001    810000
LMFLCHR34_08    800001    910000
LMFLCHR34_09    900001    1010000
LMFLCHR34_10    1000001   1110000
LMFLCHR34_11    1100001   1210000
LMFLCHR34_12    1200001   1310000
LMFLCHR34_13    1300001   1410000
LMFLCHR34_14    1400001   1510000
LMFLCHR34_15    1500001   1610000
LMFLCHR34_16    1600001   1710000
LMFLCHR34_17    1700001   1720777



Continuation (6 of 18) of LMFLCHR34 from base 500001 (AL499623 Leishmania major 
chromosome 34 clone Chr.34 strain Friedlin, *** SEQUENCING IN PROGRESS ***, in 
ordered pieces. 5/2001) 



Query Match 8.0%; Score 34.2; DB 2; Length 110000; 

Best Local Similarity 50.3%; Pred. No. 14; 

Matches 84; Conservative 0; Mismatches 83; Indels 0; Gaps 0;



Qy 216 tagctaggcggtggatccgagcctgtatcagaaatcgaaataatataagactgtcttcaa 275 

I I I I III II I I I I I I I I I I I I I II I I I I I 

Db 12375 TCGCTTCGCGCAGCAGTGCGGCCTGTAACCGCCGTCGGAGCCATCGCTTCCACCGGTCAA 12316 

Qy 276 cggatcacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccga 335

I I I I I I I I I I III I I I I I I III I I I I I 

Db 12315 AGCACCGCTCTGCCAGGTGCAGAAGCGCCTCGTGGCGGATGAGTCGCATGCTAGCCCCGA 12256 

Qy 336 ttaacggctcacgctaccaggcgctctacgcggatgtgccccctagc 382 

I I I I I I I I I I I I I I I I I I I I II I I I I I I 
Db 12255 TGAGCGCACGACGCTGCCACGCGCGAAATGCCGCCGTCGCCACTTGC 12209 



RESULT 8 

AC008676 

LOCUS AC008676 207418 bp DNA HTG 20-APR-2001
DEFINITION Homo sapiens chromosome 5 clone CTB-47B11, WORKING DRAFT SEQUENCE,
8 unordered pieces.
ACCESSION AC008676
VERSION AC008676.5 GI:13699408
KEYWORDS HTG; HTGS_PHASE1; HTGS_DRAFT; HTGS_ACTIVEFIN.
SOURCE human.
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 207418)
AUTHORS DOE Joint Genome Institute.
TITLE Sequencing of Human Chromosome 5
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 207418)
AUTHORS DOE Joint Genome Institute.
TITLE Direct Submission
JOURNAL Submitted (03-AUG-1999) Production Sequencing Facility, DOE Joint
Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
COMMENT On Apr 20, 2001 this sequence version replaced gi:7709257.

Genome Center 

Center: Joint Genome Institute 
Center Code: JGI 

Web site: http://www.jgi.doe.gov 

Project Information 

Center Project Name: 82287, H304 

Center clone name: CIT978SKB_47B11 

Summary Statistics 

Consensus quality: 198468 bases at least Q40 

Consensus quality: 203888 bases at least Q30 

Consensus quality: 205235 bases at least Q20 

Estimated insert size: 213000; pulse field gel estimation 

Estimated insert size: 206718; sum-of-contigs estimation

Quality coverage: 8.49 in Q20 bases; pulse field gel estimation 

Quality coverage: 8.75 in Q20 bases; sum-of-contigs estimation. 

* NOTE: This is a 'working draft' sequence. It currently
* consists of 8 contigs. The true order of the pieces
* is not known and their order in this sequence record is
* arbitrary. Gaps between the contigs are represented as
* runs of N, but the exact sizes of the gaps are unknown.
* This record will be updated with the finished sequence
* as soon as it is available and the accession number will
* be preserved.
* 1 1010: contig of 1010 bp in length
* 1011 1110: gap of unknown length
* 1111 4701: contig of 3591 bp in length
* 4702 4801: gap of unknown length
* 4802 19121: contig of 14320 bp in length
* 19122 19221: gap of unknown length
* 19222 49181: contig of 29960 bp in length
* 49182 49281: gap of unknown length
* 49282 85478: contig of 36197 bp in length
* 85479 85578: gap of unknown length
* 85579 122090: contig of 36512 bp in length
* 122091 122190: gap of unknown length
* 122191 160393: contig of 38203 bp in length
* 160394 160493: gap of unknown length
* 160494 207418: contig of 46925 bp in length.
FEATURES Location/Qualifiers
source 1..207418
/organism="Homo sapiens"
/db_xref="taxon:9606"
/chromosome="5"
/clone="CTB-47B11"
/clone_lib="CalTech human BAC library B"
BASE COUNT 56983 a 45265 c 46330 g 58117 t 723 others
ORIGIN



Query Match 8.0%; Score 34.2; DB 2; Length 207418; 

Best Local Similarity 56.8%; Pred. No. 14; 

Matches 63; Conservative 0; Mismatches 48; Indels 0; Gaps 0;

Qy 18 tttgaggagagagacgagatcatgaggaagcaatactcccctgtgctctacttctgcctg 77 

I I I I I I I I I I I I I I I I II I I I I II II I I I I I I I 
Db 25935 TTAGAGGATAAACATTACACCAGGAGGAAGCTGAAATGTCCCTCCCCTGATTTATGGTTA 25994 

Qy 78 atggcccttgtcgtagctgctatggtctgtgtcatgtacaccacctcggca 128

III I I I I I I I I I I I I I II I I I I I I I I I I I I 

Db 25995 ATGAGCCTTGTGTTAGGTGCTGAGGATACGCTAATGACCTTTACCTGGGAA 26045 



RESULT 9 

SME591791 

LOCUS SME591791 340900 bp DNA BCT 16-AUG-2001
DEFINITION Sinorhizobium meliloti 1021 complete chromosome; segment 10/12.
ACCESSION AL591791 AL591688
VERSION AL591791.1 GI:15075538
KEYWORDS .
SOURCE Sinorhizobium meliloti.
ORGANISM Sinorhizobium meliloti
Bacteria; Proteobacteria; alpha subdivision; Rhizobiaceae group;
Rhizobiaceae; Sinorhizobium.
REFERENCE 1 (bases 1 to 340900)
AUTHORS Capela,D., Barloy-Hubler,F., Gouzy,J., Bothe,G., Ampe,F., Batut,
Boistard,P., Becker,A., Boutry,M., Cadieu,E., Dreano,S., Gloux,S
Godrie,T., Goffeau,A., Kahn,D., Kiss,E., Lelaure,V., Masuy,D.,

Pohl,T., Portetelle,D., Puehler,A., Purnelle,B., Ramsperger,U.,
Renard,C, Thebault,P., Vandenbol,M., Weidner,S. and Galibert,F.
TITLE From the Cover: Analysis of the chromosome sequence of the legume 

symbiont Sinorhizobium meliloti strain 1021 
JOURNAL Proceedings of the National Academy of Sciences of the United 

States of America. 98 (17), 9877-9882 (2001) 
PUBMED 11481430 
REFERENCE 2 (bases 1 to 340900) 
AUTHORS Gouzy, J. 
TITLE Direct Submission 

JOURNAL Submitted (26-JUL-2001) Gouzy J., Submitted on behalf of the MELILO
EU Consortium
COMMENT MELILO EU Consortium:
Laboratoire de Biologie Moleculaire des Relations
Plantes-Microorganismes, UMR215-CNRS-INRA, BP27, F-31326 Castanet,
France, Laboratoire de Genetique et Developpement UMR6061-CNRS,
Faculte de Medecine, 2 avenue du Pr. Leon Bernard, F-35043 Rennes,
France, GATC GmbH, Fritz-Arnold-str. 23, D-78467 Konstanz, Germany,
Universitaet Bielefeld, Biologie IV (Genetik) Universitaetstr 25,
D-33615 Bielefeld, Germany, Unite de Biochimie physiologique,
Universite Catholique de Louvain, Place Croix du Sud 2, Bte 20,
B-1348 Louvain-la-Neuve, Belgium, Unite de Microbiologie, Faculte
des Sciences Agronomiques de Gembloux, Avenue Marechal Juin 6,
B-5030 Gembloux, Belgium. E-mail: Jerome.Gouzy@toulouse.inra.fr
http://sequence.toulouse.inra.fr/meliloti.html.
FEATURES Location/Qualifiers 
source 1..340900
/organism="Sinorhizobium meliloti"
/strain="1021"
/db_xref="taxon:382"
gene 132..518
/gene="SMc01981"
CDS 132..518
/gene="SMc01981"
/function="small molecule metabolism; energy transfer;
electron transport"
/note="Product confidence : putative
Gene name confidence : hypothetical
predicted by Codon_usage
predicted by Homology
predicted by FrameD"
/codon_start=1
/transl_table=11
/evidence=not_experimental
/product="PUTATIVE CYTOCHROME C TRANSMEMBRANE PROTEIN"
/protein_id="CAC47094.1"
/db_xref="GI:15075539"
/translation="MRIALAALSFSALSLCPVAGLAQEGDAEAGATVFKKCATCHVID
KDQNKVGPSLQGVIGRTAGTHADFKYSQAMIDAGKGGLVWDDATLAEYLRNPRAKVKG
TKMVFPGLKKDEEIANVIAYLKQHPK"
gene 656..1600
/gene="coxM OR SMc01982"
CDS 656..1600
/gene="coxM OR SMc01982"
/EC_number="1.9.3.1"
/function="small molecule metabolism; energy transfer;
electron transport"
/note="Product confidence : probable
Gene name confidence : probable
predicted by Codon_usage
predicted by Homology
predicted by FrameD"
/codon_start=1
/transl_table=11
/evidence=not_experimental
/product="PROBABLE ALTERNATIVE CYTOCHROME C OXIDASE
POLYPEPTIDE II TRANSMEMBRANE PROTEIN"
/protein_id="CAC47095.1"
/db_xref="GI:15075540"
/translation="MAVVVILVLLAVGSVLFHLLSPWWWTPIASNWNYIDNTITITFW
ITGIAFTAVVLFMAYCVLRFRHRPGNTAAYEPENRRLEGWLATGTTFGVAAMLAPGLF
VWNQFVTVPQDASEVEVIGQQWLWSFRLPGADGKLGTTETRDIAPENTLGVNRDDAAG
QDDIIIEGGELHLPVGKPVKMLLRSVDVLHDFYVPEFRAKMDMVPGMITYFWLTPTRT
GTFEILCAELCGVGHPQMRGTVVVDTEEDYQAWLAEQQTFSQLSASSETRAVPEKVCS
GFPSGIATEQGTGASALFKEERECFGPAATTVAASAAQ"



repeat_region complement(1457..1513)
/note="Sm-5 OR SMc04646
REPEAT SM-5
predicted by Homology"
/evidence=not_experimental
gene 1634..3415
/gene="coxN OR SMc01983"
CDS 1634..3415
/gene="coxN OR SMc01983"
/EC_number="1.9.3.1"
/function="small molecule metabolism; energy transfer;
electron transport"
/note="Product confidence : probable
Gene name confidence : probable
predicted by Codon_usage
predicted by Homology
predicted by FrameD"
/codon_start=1
/transl_table=11
/evidence=not_experimental
/product="PROBABLE ALTERNATIVE CYTOCHROME C OXIDASE
POLYPEPTIDE I TRANSMEMBRANE PROTEIN"
/protein_id="CAC47096.1"
/db_xref="GI:15075541"
/translation="MMVDVRSGIGEALPPPEVEDVELYHPHSWWTRYVFSQDAKIIAI
QYSMTAIAIGMVALVLSWLIRLQLGFPGTFELIDAERYYQFITMHGMIMVIYLLTALF
LGGFGNYLIPLMVGARDMVFPYANMLSYWIYLLAVIVLAASFFTPGGPTGAGWTLYPP
QAVLSGTPGGRDWGIIMMLSSLIIFVIGFTMGGLNYVVTVLQGRARGMTLMRLPLTVW
GIFTATVMALLAFPALFVACVMMLFDRLLGTSFFMPAIVEMGEQLQYGGGSPILFQHL
FWFFGHPEVYIVALPAFGIVSDLISTHARKNIFGYRMMVWAIVIIGGLSFIVWAHHMY
VSGMNPYFGFFFATTTLIIAVPTAIKVYNWVLTLWRGNIHLTLPMLFALAFIVTFVNG
GLTGLFLGNVVVDVPLSDTMFVVAHFHMVMGVAPIMVIFGAIYHWYPKITGRMLNEAM
GQIHFWVTFIGAYAIFFPMHYLGLIGVPRRYHELGEASFVTTSIAELNAFISVMALLV
GAAQIVFLFNLAWSLRHGREAGGNPWRATTLEWQTPETPPPHGNWGNELPVVYRWAYD
YSVPGAPEDFIPQNQPTPGRLSHETVS"
gene 3412..4110
/gene="coxO OR SMc01984"
CDS 3412..4110
/gene="coxO OR SMc01984"



/EC_number="1.9.3.-"
/function="small molecule metabolism; energy transfer;
electron transport"
/note="Product confidence : probable
Gene name confidence : probable
predicted by Codon_usage
predicted by Homology
predicted by FrameD"
/codon_start=1
/transl_table=11
/evidence=not_experimental
/product="PROBABLE CYTOCHROME-C OXIDASE TRANSMEMBRANE
PROTEIN"
/protein_id="CAC47097.1"
/db_xref="GI:15075542"
/translation="MSIVAFFLAAIAAIIAWWLAGQRLTSRPWLEVGHFHDRRGATRL
PPAKIGLGVFLAVVGALFSLAISAYFMRMASSDWGALPLPGLLWLNTGILAAGSITLH
WTKVEAERRNDEAARIGLLAGLALGLAFLAGQLFAWRALSDAGYFLAGNPANSFFYLL
TGMHGLHIIGGLFALGRVTAHASQTPLGNRTRLSIELCAIYWHFMLIVWLVLFALFAG
WASGVIEFCRQLLT"
gene 4123..4845
/gene="coxP OR SMc01985"
CDS 4123..4845
/gene="coxP OR SMc01985"
/EC_number="1.9.3.-"
/function="small molecule metabolism; energy transfer;
electron transport"
/note="Product confidence : probable
Gene name confidence : probable
predicted by Codon_usage
predicted by Homology
predicted by FrameD"
/codon_start=1
/transl_table=11
/evidence=not_experimental
/product="PROBABLE CYTOCHROME-C OXIDASE PROTEIN"
/protein_id="CAC47098.1"
/db_xref="GI:15075543"
/translation="MSAPMKKPAAETLPRPAGLAGLASDWASDQRTFKDVSWGKAMMW
IFLLSDTFVFGCFLLAYMSARMSTSVPWPNPSEVFALEIGGTHMPLILIAIMTFVLIS
SSGTMAMAVNYGYRRDRRKTAALMLLTALFGAAFVGMQAFEWSKLIAEGVRPWGNPWG
AAQFGSTFFMITGFHGTHVTFGVIFLLIVARKVWRGDFETERRGFFTSRKGRYEIVEI
TGLYWHFVDLVWVFIFAFFYLW"
gene 4857..5216
/gene="SMc01986"
CDS 4857..5216
/gene="SMc01986"
/function="miscellaneous; hypothetical/partial homology"
/note="Product confidence : hypothetical
Gene name confidence : hypothetical
predicted by Codon_usage
predicted by FrameD"
/codon_start=1
/transl_table=11
/evidence=not_experimental
/product="HYPOTHETICAL TRANSMEMBRANE PROTEIN"
/protein_id="CAC47099.1"
/db_xref="GI:15075544"
/translation="MAHAETHPAQHAAAHTEHQQHAIKLYLLVWGLLFVLSALSYLVD
YFGLQGYLRWSLILIFMMLKAGLIVAVFMHMAWERLALIYAIILPPLLVLVFVALMVS
ESNYVLFTRLTFFGGGE"
gene 5477..8038
/gene="SMc01987"
CDS 5477..8038
/gene="SMc01987"
/function="small molecule metabolism"
/note="Product confidence : putative
Gene name confidence : hypothetical
predicted by Codon_usage
predicted by Homology
predicted by FrameD"
/codon_start=1
/transl_table=11

Query Match 8.0%; Score 34.2; DB 1; Length 340900; 

Best Local Similarity 50.9%; Pred. No. 13; 

Matches 81; Conservative 0; Mismatches 78; Indels 0; Gaps 0; 

Qy 1 cggacgcgtgggtgcaatttgaggagagagacgagatcatgaggaagcaatactcccctg 60 

I I I I I III III II I I I I I II I I I I I I I I I I I I I I 
Db 187045 CGGGCAGGCCGTTGGGATTGATGAAATGCGACGAGGTCATGCCGATGCGGCGCGCCTCGG 187104

Qy 61 tgctctacttctgcctgatggcccttgtcgtagctgctatggtctgtgtcatgtacacca 120 

III I I I I I I I I I I I I I I I II I I I I I II II 
Db 187105 CGTTCATCTTCTCGATGAAGGCCTGTTCGGATCCGCCGACCGTCTCGGCGATCGCGACGG 187164 

Qy 121 cctcggcacaagcaggaaggagtggctacaactcgtacg 159 

I III III II II I I I I II I 
Db 187165 CGACGTCGTTCGCCGACTTGATCAGCATCATCTTCAACG 187203 
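
The percentages reported for this result appear consistent with simple ratios. Below is a minimal Python sketch, not part of the GenCore output, under two assumptions that the report itself does not state: that Best Local Similarity is the number of matches divided by the number of aligned columns, and that Query Match is the score divided by the perfect score of 425 given in the run header. The function name `percent` is hypothetical.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)

# Values copied from the summary lines of this result.
matches, mismatches, indels = 81, 78, 0
score = 34.2
perfect_score = 425  # assumption: perfect score from the run header

# Assumed definition: matches over all aligned columns.
best_local_similarity = percent(matches, matches + mismatches + indels)

# Assumed definition: score as a fraction of the perfect score.
query_match = percent(score, perfect_score)

print("Best Local Similarity:", best_local_similarity)
print("Query Match:", query_match)
```

The same two ratios also reproduce the figures printed for the neighbouring results, for example 84 matches over 167 columns and 63 matches over 111 columns.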



RESULT 10 
GGVITIIG/c 
LOCUS GGVITIIG 20343 bp DNA VRT 10-FEB-1999
DEFINITION Chicken vitellogenin II gene.
ACCESSION X13607
VERSION X13607.1 GI:63886
KEYWORDS vitellogenin.
SOURCE chicken.
ORGANISM Gallus gallus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Archosauria; Aves; Neognathae; Galliformes; Phasianidae;
Phasianinae; Gallus.
REFERENCE 1 (bases 1 to 20343)
AUTHORS Ab,G.
TITLE Direct Submission
JOURNAL Submitted (28-NOV-1988) AB G., Dept. of Biochemistry, Nijenborgh
16, 9747 AG Groningen, The Netherlands
REFERENCE 2 (bases 1 to 20343)
AUTHORS van het Schip,F.D., Samallo,J., Broos,J., Ophuis,J., Mojet,M.,
Gruber,M. and AB,G.
TITLE Nucleotide sequence of a chicken vitellogenin gene and derived
amino acid sequence of the encoded yolk precursor protein
JOURNAL J. Mol. Biol. 196 (2), 245-260 (1987)
MEDLINE 88011328

COMMENT The sequence overlaps with that reported by Byrne et al. in
Biochemistry 23:4275-4279(1984) K02113, by Nardelli et al. in J.
Biol. Chem. 262:15377-15385(1987) M18060, by Nardelli et al.
Biochemistry 26:6397-6402(1987) and Walker et al. in EMBO J.
2:2271-2279(1983).
FEATURES Location/Qualifiers 
source 1..20343
/organism="Gallus gallus"
/strain="White Leghorn"
/db_xref="taxon:9031"
/tissue_type="blood"
/clone_lib="Charon 4A"
/clone="22, 24"
exon 1..53
/number=1

CDS join(14..53,169..189,290..441,1769..2020,2532..2693,
2799..2947,3116..3266,3385..3542,3625..3790,4692..4817,
5222..5440,6011..6228,6345..6442,6920..7044,7166..7267,
7906..8025,8112..8301,9055..9148,9281..9466,9959..10103,
10225..10447,10538..10713,11067..11756,12221..12271,
12525..12614,13054..13266,13362..13504,13889..14027,
14568..14703,15161..15347,15679..15770,16857..16954,
17704..17863,18919..19080,20003..20121)
/codon_start=1
/product="vitellogenin"
/protein_id="CAA31942.1"
/db_xref="GI:63887"
/db_xref="SWISS-PROT:P02845"

/translation="MRGIILALVLTLVGSQKFDIDPGFNSRRSYLYNYEGSMLNGLQD 
RSLGKAGVRLSSKLEISGLPENAYLLKVRSPQVEEYNGWPRDPFTRSSKITQVISSC 
FTRLFKFEYSSGRIGNIYAPEDCPDLCVNIVRGILNMFQMTIKKSQNVYELQEAGIGG 
ICHARYVIQEDRKNSRIYVTRTVDLNNCQEKVQKSIGMAYIYPCPVDVMKERLTKGTT 
AFSYKLKQSDSGTLITDVSSRQVYQISPFNEPTGVAVMEARQQLTLVEVRSERGSAPD 
VPMQNYGSLRYRFPAVLPQMPLQLIKTKNPEQRIVETLQHIVLNNQQDFHDDVSYRFL 
EVVQLCRIANADNLESIWRQVSDKPRYRRWLLSAVSASGTTETLKFLKNRIRNDDLNY 
IQTLLTVSLTLHLLQADEHTLPIAADLMTSSRIQKNPVLQQVACLGYSSVVNRYCSQT 
SACPKEALQPIHDLADEAISRGREDKMKLALKCIGNMGEPASLKRILKFLPISSSSAA 
DIPVHIQIDAITALKKIAWKDPKTVQGYLIQILADQSLPPEVRMMACAVIFETRPALA 
LITTIANVAMKESNMQVASFVYSHMKSLSKSRLPFMYNISSACNIALKLLSPKLDSMS 
YRYSKVIRADTYFDNYRVGATGEIFVVNSPRTMFPSAIISKLMANSAGSVADLVEVGI 
RVEGLADVIMKRNIPFAEYPTYKQIKELGKALQGWKELPTETPLVSAYLKILGQEVAF 
ININKELLQQVMKTVVEPADRNAAIKRIANQIRNSIAGQWTQPVWMGELRYVVPSCLG 
LPLEYGSYTTALARAAVSVEGKMTPPLTGDFRLSQLLESTMQIRSDLKPSLYVHTVAT 
MGVNTEYFQHAVEIQGEVQTRMPMKFDAKIDVKLKNLKIETNPCREETEIVVGRHKAF 
AVSRNIGELGVEKRTSILPEDAPLDVTEEPFQTSERASREHFAMQGPDSMPRKQSHSS 
REDLRRSTGKRAHKRDICLKMHHIGCQLCFSRRSRDASFIQNTYLHKLIGEHEAKIVL 
MPVHTDADIDKIQLEIQAGSRAAARIITEVNPESEEEDESSPYEDIQAKLKRILGIDS 
MFKVANKTRHPKNRPSKKGNTVLAEFGTEPDAKTSSSSSSASSTATSSSSSSASSPNR 
KKPMDEEENDQVKQARNKDASSSSRSSKSSNSSKRSSSKSSNSSKRSSSSSSSSSSSS 
RSSSSSSSSSSNSKSSSSSSKSSSSSSRSRSSSKSSSSSSSSSSSSSSKSSSSRSSSS 
SSKSSSHHSHSHHSGHLNGSSSSSSSSRSVSHHSHEHHSGHLEDDSSSSSSSSVLSKI 
WGRHEIYQYRFRSAHRQEFPKRKLPGDRATSRYSSTRSSHDTSRAASWPKFLGDIKTP
VLAAFLHGISNNKKTGGLQLVVYADTDSVRPRVQVFVTNLTDSSKWKLCADASVRNAH 



KAVAYVKWGWDCRDYKVSTELVTGRFAGHPAAQVKLEWPKVPSNVRSVVEWFYEFVPG
AAFMLGFSERMDKNPSRQARMVVALTSPRTCDVVVKLPDIILYQKAVRLPLSLPVGPR
IPASELQPPIWNVFAEAPSAVLENLKARCSVSYNKIKTFNEVKFNYSMPANCYHILVQ
DCSSELKFLVMMKSAGEATNLKAINIKIGSHEIDMHPVNGQVKLLVDGAESPTANISL
ISAGASLWIHNENQGFALAAPGHGIDKLYFDGKTITIQVPLWMAGKTCGICGKYDAEC
EQEYRMPNGYLAKNAVSFGHSWILEEAPCRGACKLHRSFVKLEKTVQLAGVDSKCYST
EPVLRCAKGCSATKTTPVTVGFHCLPADSANSLTDKQMKYDQKSEDMQDTVDAHTTCS
CENEECST"
intron 54..168
/number=1
exon 169..189
/number=2
intron 190..289
/number=2
exon 290..441
/number=3
intron 442..1768
/number=3
exon 1769..2020
/number=4
intron 2021..2531
/number=4
exon 2532..2693
/number=5
intron 2694..2798
/number=5
exon 2799..2947
/number=6
intron 2948..3115
/number=6
exon 3116..3266
/number=7
intron 3267..3384
/number=7
exon 3385..3542
/number=8
intron 3543..3624
/number=8
exon 3625..3790
/number=9
intron 3791..4691
/number=9
exon 4692..4817
/number=10
intron 4818..5221
/number=10
exon 5222..5440
/number=11
intron 5441..6010
/number=11
exon 6011..6228
/number=12
intron 6229..6344
/number=12
exon 6345..6442
/number=13
intron 6443..6919
/number=13
exon 6920..7044
/number=14
intron 7045..7165
/number=14
exon 7166..7267
/number=15
intron 7268..7905
/number=15
exon 7906..8025
/number=16
intron 8026..8111
/number=16
exon 8112..8301
/number=17
intron 8302..9054
/number=17
exon 9055..9148
/number=18
intron 9149..9280
/number=18
exon 9281..9466
/number=19
intron 9467..9958
/number=19
exon 9959..10103
/number=20
intron 10104..10224
/number=20
exon 10225..10447
/number=21
intron 10448..10537
/number=21
exon 10538..10713
/number=22
intron 10714..11066
/number=22
exon 11067..11756
/number=23
intron 11757..12220
/number=23
exon 12221..12271
/number=24
intron 12272..12524
/number=24
exon 12525..12614
/number=25
intron 12615..13053
/number=25
exon 13054..13266
/number=26
intron 13267..13361
/number=26
exon 13362..13504
/number=27
intron 13505..13888
/number=27
exon 13889..14027
/number=28
intron 14028..14567
/number=28
exon 14568..14703
/number=29
intron 14704..15160
/number=29
exon 15161..15347
/number=30
intron 15348..15678


Query Match 8.0%;
Best Local Similarity 51.7%;
Matches 77; Conservative 0; Mismatches 72; Indels 0; Gaps 0;



Qy 207 ggcagcccctagctaggcggtggatccgagcctgtatcagaaatcgaaataatataagac 266

III II I I I I I I I I I I I I I i I I Ml II I I I I 
Db 11049 GGAAATCACAGGAGAGGGGATACATCTGTTTCTGTATTTACAATGTGAAATATCTGTGGG 10990 

Qy 267 tgtcttcaacggatcacactgccgctcccccacgctaaatttgggggctacagtgcacac 326

I I I I II I I I I I I I I II III III I I I I I I I I 1 I I 
Db 10989 TGTCAGACAAGCTGCTCACCGCAGCCTTTCCTGCCTCACCATGGAAGCTGCAGAGCTCAC 10930 

Qy 327 gctagccgattaacggctcacgctaccag 355 

II III II II II III 
Db 10929 TCTTCCCCACCAAAATCTGCCCCAGCAAG 10901 



RESULT 11 
AC023140/c 
LOCUS AC023140 194575 bp DNA HTG 07-JUL-2000
DEFINITION Homo sapiens chromosome 15 clone RP11-535P8, WORKING DRAFT
SEQUENCE, 26 unordered pieces.
ACCESSION AC023140
VERSION AC023140.4 GI:8570285
KEYWORDS HTG; HTGS_PHASE1; HTGS_DRAFT.
SOURCE human.
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 194575)
AUTHORS Waterston, R.H.
TITLE The sequence of Homo sapiens clone
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 194575)
AUTHORS Waterston, R.H.
TITLE Direct Submission
JOURNAL Submitted (08-FEB-2000) Genome Sequencing Center, Washington
University School of Medicine, 4444 Forest Park Parkway, St. Louis,
MO 63108, USA
COMMENT On Jun 17, 2000 this sequence version replaced gi:7232193.

Genome Center 

Center: Washington University Genome Sequencing Center 
Center code: WUGSC 

Web site: http://genome.wustl.edu/gsc/index.shtml



Project Information 

Center project name: H_NH0535P08 

Summary Statistics 

Sequencing vector: M13; 100% 
Sequencing vector: plasmid; 0% 
Chemistry: Dye-primer ET; 100% of reads 
Chemistry: Dye-terminator Big Dye; 0% of reads 
Assembly program: Phrap; version 0.990319 
Consensus quality: 183084 bases at least Q40 
Consensus quality: 186527 bases at least Q30 
Consensus quality: 188372 bases at least Q20 
Insert size: 209000; agarose-fp 
Insert size: 192075; sum-of-contigs
Quality coverage: 3.93 in Q20 bases; agarose-fp 
Quality coverage: 4.32 in Q20 bases; sum-of-contigs 



* NOTE: This is a 'working draft' sequence. It currently
* consists of 26 contigs. The true order of the pieces
* is not known and their order in this sequence record is
* arbitrary. Gaps between the contigs are represented as
* runs of N, but the exact sizes of the gaps are unknown.
* This record will be updated with the finished sequence
* as soon as it is available and the accession number will
* be preserved.
* 1 1295: contig of 1295 bp in length
* 1296 1395: gap of unknown length
* 1396 3355: contig of 1960 bp in length
* 3356 3455: gap of unknown length
* 3456 5113: contig of 1658 bp in length
* 5114 5213: gap of unknown length
* 5214 6910: contig of 1697 bp in length
* 6911 7010: gap of unknown length
* 7011 9042: contig of 2032 bp in length
* 9043 9142: gap of unknown length
* 9143 11645: contig of 2503 bp in length
* 11646 11745: gap of unknown length
* 11746 14544: contig of 2799 bp in length
* 14545 14644: gap of unknown length
* 14645 17322: contig of 2678 bp in length
* 17323 17422: gap of unknown length
* 17423 19837: contig of 2415 bp in length
* 19838 19937: gap of unknown length
* 19938 25695: contig of 5758 bp in length
* 25696 25795: gap of unknown length
* 25796 29231: contig of 3436 bp in length
* 29232 29331: gap of unknown length
* 29332 32892: contig of 3561 bp in length
* 32893 32992: gap of unknown length
* 32993 38263: contig of 5271 bp in length
* 38264 38363: gap of unknown length
* 38364 42404: contig of 4041 bp in length
* 42405 42504: gap of unknown length
* 42505 48404: contig of 5900 bp in length
* 48405 48504: gap of unknown length
* 48505 57290: contig of 8786 bp in length
* 57291 57390: gap of unknown length
* 57391 65905: contig of 8515 bp in length



* 65906 66005: gap of unknown length 

* 66006 74728: contig of 8723 bp in length 

* 74729 74828: gap of unknown length 

* 74829 87289: contig of 12461 bp in length 

* 87290 87389: gap of unknown length 

* 87390 99467: contig of 12078 bp in length 

* 99468 99567: gap of unknown length

* 99568 111113: contig of 11546 bp in length 

* 111114 111213: gap of unknown length 

* 111214 122406: contig of 11193 bp in length 

* 122407 122506: gap of unknown length 

* 122507 135480: contig of 12974 bp in length 

* 135481 135580: gap of unknown length 

* 135581 153618: contig of 18038 bp in length 

* 153619 153718: gap of unknown length 

* 153719 170017: contig of 16299 bp in length 

* 170018 170117: gap of unknown length 

* 170118 194575: contig of 24458 bp in length. 
FEATURES Location/Qualifiers 

source 1..194575
/organism="Homo sapiens"
/db_xref="taxon:9606"
/chromosome="15"
/clone="RP11-535P8"
misc_feature 1..1295
/note="assembly_name:Contig9"
misc_feature 1396..3355
/note="assembly_name:Contig10"
misc_feature 3456..5113
/note="assembly_name:Contig11"
misc_feature 5214..6910
/note="assembly_name:Contig12"
misc_feature 7011..9042
/note="assembly_name:Contig13"
misc_feature 9143..11645
/note="assembly_name:Contig14"
misc_feature 11746..14544
/note="assembly_name:Contig15"
misc_feature 14645..17322
/note="assembly_name:Contig16
clone_end:T7
vector_side:left"
misc_feature 17423..19837
/note="assembly_name:Contig17"
misc_feature 19938..25695
/note="assembly_name:Contig18"
misc_feature 25796..29231
/note="assembly_name:Contig19"
misc_feature 29332..32892
/note="assembly_name:Contig20"
misc_feature 32993..38263
/note="assembly_name:Contig21"
misc_feature 38364..42404
/note="assembly_name:Contig22"
misc_feature 42505..48404
/note="assembly_name:Contig23"
misc_feature 48505..57290
/note="assembly_name:Contig24"
misc_feature 57391..65905
/note="assembly_name:Contig25"
misc_feature 66006..74728
/note="assembly_name:Contig26"
misc_feature 74829..87289
/note="assembly_name:Contig27"
misc_feature 87390..99467
/note="assembly_name:Contig28"
misc_feature 99568..111113
/note="assembly_name:Contig29"
misc_feature 111214..122406
/note="assembly_name:Contig30"
misc_feature 122507..135480
/note="assembly_name:Contig31"
misc_feature 135581..153618
/note="assembly_name:Contig32"
misc_feature 153719..170017
/note="assembly_name:Contig33
clone_end:SP6
vector_side:right"
misc_feature 170118..194575
/note="assembly_name:Contig34"
BASE COUNT 58990 a 40026 c 37747 g 55237 t 2575 others 
ORIGIN 



Query Match 8.0%; Score 33.8; DB 2; Length 194575; 

Best Local Similarity 53.4%; Pred. No. 18; 

Matches 71; Conservative 0; Mismatches 62; Indels 0; Gaps 0; 

Qy 293 cccccacgctaaatttgggggctacagtgcacacgctagccgattaacggctcacgctac 352 

III II I I I I I I I I I I I I I II II I I II I I I I I I 
Db 71784 CCACTGTGATTTCTTTGCTGCTGACTGGGCCCACCTTTGAAAATCTATGGCTGACACTTC 71725

Qy 353 caggcgctctacgcggatgtgccccctagccagcttctctctccccctcgttctgtggtg 412 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 

Db 71724 CTTACTATCTAAGATGACATCAGTGCCATCCTGTCTTTCACTCTTCCTCCTACAGGGCTT 71665 

Qy 413 cctctctcaacct 425 

I I I I I I I II 
Db 71664 TCTATCTCATTCT 71652 



RESULT 12
AC084763
LOCUS AC084763 141307 bp DNA PLN 30-JAN-2001
DEFINITION Oryza sativa chromosome 10 BAC OSJNBa0027P10 genomic sequence,
complete sequence.
ACCESSION AC084763
VERSION AC084763.4 GI:12597872
KEYWORDS HTG.
SOURCE Oryza sativa.
ORGANISM Oryza sativa
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
Magnoliophyta; Liliopsida; Poales; Poaceae; Ehrhartoideae; Oryzeae;
Oryza.



REFERENCE 1 (bases 1 to 141307)
AUTHORS Buell,C.R., Yuan,Q., Moffat,K.S., Hill,J.N., Jenkins,C.N.,
Hsiao,J., Zismann,V., Pai,G., Bowman,C.L., Fujii,C.Y.,
VanAken,S.E., Craven,B., Khalak,H., Feldblyum,T.V., Quackenbush,J.,
White,O., Salzberg,S.L. and Fraser,C.M.
TITLE Oryza sativa chromosome 10 BAC OSJNBa0027P10 genomic sequence
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 141307)
AUTHORS Buell,R.
TITLE Direct Submission
JOURNAL Submitted (15-NOV-2000) The Institute for Genomic Research, 9712
Medical Center Dr, Rockville, MD 20850, USA
REFERENCE 3 (bases 1 to 141307)
AUTHORS Buell,R.
TITLE Direct Submission
JOURNAL Submitted (30-JAN-2001) The Institute for Genomic Research, 9712
Medical Center Dr, Rockville, MD 20850, USA, rbuell@tigr.org
COMMENT On Jan 30, 2001 this sequence version replaced gi:12039410.
Address all correspondence to: rice@tigr.org
BAC clone OSJNBa0027P10 is from Oryza sativa chromosome 10.
The orientation of the sequence is from SP6 to T7 end of the BAC
clone.
Genes were identified by a combination of several methods: Gene
prediction programs including Genscan and Genscan+ (Chris Burge,
http://CCR-081.mit.edu/GENSCAN.html), GeneMarkHMM (Mark Borodovsky,
http://genemark.biology.gatech.edu/GeneMark/), Fgenesh
(http://www.softberry.com/), and GeneSplicer (Mihaela Pertea and
Steven Salzberg, contact mpertea@tigr.org), searches of the
complete sequence against a peptide database and the plant EST
database at TIGR (http://www.tigr.org/tdb/tgi.shtml). Annotated
genes are named to indicate the level of evidence for their
annotation. Genes with similarity to other proteins are named after
the database hits. Genes without significant peptide similarity but
with EST similarity are named as unknown proteins. Genes without
protein or EST similarity, that are predicted by more than two gene
prediction programs over most of their length are annotated as
hypothetical proteins. Genes encoding tRNAs are predicted by
tRNAscan-SE (Sean Eddy, http://genome.wustl.edu/eddy/tRNAscan-SE/).
Simple repeats are identified by repeatmasker (Arian Smit,
http://ftp.genome.washington.edu/RM/RepeatMasker.html).
FEATURES Location/Qualifiers
source 1..141307
/organism="Oryza sativa"
/cultivar="Nipponbare"
/sub_species="japonica"
/db_xref="taxon:4530"
/chromosome="10"
/map="near C239"
/clone="OSJNBa0027P10"
repeat_region complement(784..934)
/rpt_family="Explorer_Os3 MITE element from gb:D25363 Rice
genomic DNA, G11Q3A, sequence tagged site (66 to 225) 160
nt"
repeat_region complement(2007..2034)
/rpt_family="AT_rich"
repeat_region 3436..3458
/rpt_family="(TTAA)n"
repeat_region 3916..4092
/rpt_family="Wanderer_011 MITE element from gb:U34601
Oryza sativa wanderer mobile element linked to Xa21 (411
to 637) 227 nt"
repeat_region complement(3975..4073)
/rpt_family="Tourist_012 MITE element from gb:U72727 Oryza
longistaminata receptor kinase-like protein, family member
A2, pseudogene sequence (5453 to 5694) 242 nt"
repeat_region complement(4684..4705)
/rpt_family="(TA)n"
repeat_region complement(5218..5247)
/rpt_family="AT_rich"
repeat_region complement(5736..5768)
/rpt_family="AT_rich"
repeat_region complement(5950..6118)
/rpt_family="Gaijin_Os3 element from gb:D32165 Rice gene
for aspartic protease (302 to 448) 147 nt"
repeat_region complement(6435..6517)
/rpt_family="AT_rich"
repeat_region complement(7038..7063)
/rpt_family="AT_rich"
repeat_region complement(8244..8263)
/rpt_family="(TGG)n"
mRNA join(<8312..9043,9241..9296,9519..>9552)
/gene="OSJNBa0027P10.12"
gene 8312..9552
/gene="OSJNBa0027P10.12"
/note="similar to ethylene responsive element binding
protein EREBP 4 GB:BAA07323 GI:1208497 (Nicotiana
tabacum)"
CDS join(8312..9043,9241..9296,9519..9552)
/gene="OSJNBa0027P10.12"
/codon_start=1
/product="putative ethylene-responsive element binding
protein"
/protein_id="AAG60182.1"
/db_xref="GI:12597874"
/translation="MAGFGLDQHLDLIRAHLLEDAHHHVLAPSPSPSPPGTGRVRPAP
VSLPPRPPLLWAAASAAPRQQEECFELGGGYAGEGEGEEEDDFRRYRGVRQRPWGKYA
AEIRDPARKGARVWLGTYDTAVEAARAYDR7VAFQLRGSKAILNFPNEV7VADAAVKWAP
PVAPIPAAAMSAGRGKRVRSEEQYYLREVKKERLIMAPPENSSSSSSSAAAAAGDIWD
ELKGICSLPPLSPLSPHPHMAFPQLFVIDLAFGQILNSFVLLLLRTRDFAFDYAI"
repeat_region 8539..8564
/rpt_family="(AGGGGG)n"
mRNA join(<11624..11732,11854..11938,13068..13140,13236..13256,
13343..13396,13879..13929,14124..14186,15136..15201,
15308..15346,16032..16154,16674..16766,16873..16954,
17067..17419)
/gene="OSJNBa0027P10.1"
gene 11624..17419
/gene="OSJNBa0027P10.1"
/note="similar to palmitoyl protein thioesterase
GB:AAA85337 GI:1160967; EST AU065981 from this gene"
CDS join(11624..11732,11854..11938,13068..13140,13236..13256,
13343..13396,13879..13929,14124..14186,15136..15201,
15308..15346,16032..16154,16674..16766,16873..16954,
17067..17095)
/gene="OSJNBa0027P10.1"
/codon_start=1
/product="putative palmitoyl-protein thioesterase"
/protein_id="AAG60184.1"
/db_xref="GI:12597876"
/translation="MAYALRGAALVGVLLLVVASPALVPVASAVPFIVLHGIGDQCEN
GGMASFTEMLGEWSGSKGYCILNVFLLSEIGRGAWDSWLMPLQEQADTVCKKVKKMKE
LRKGYSIVGLSQGNLIGRAVIEYCDGGPPVKNFISIGGPHAGTASVPLCGSGIVCVLI
DALIKLEIYSNYVQAHLAPSGYLKIPTDMTDYLKGCKFLPKLNNEIPSERNATYKQRF
SSLENLVLIMFEDDAVLIPRETAWFGYYPDGAFSPVQPPQKTKLYTEDWIGLKALEEA
GRVKFVSVPGEEGVPSDQF"
repeat_region 13558..13589
/rpt_family="(TAAAAA)n"
repeat_region complement(15485..15668)
/rpt_family="Wanderer_Os5 MITE element from gb:X13679
Oryza sativa H3 histone pseudogene H3R-12 (2 to 187) 186
nt"
repeat_region complement(15516..15571)
/rpt_family="AT_rich"
repeat_region 15563..15647
/rpt_family="Wanderer_011 MITE element from gb:U34601
Oryza sativa wanderer mobile element linked to Xa21 (411
to 637) 227 nt"
repeat_region complement(17910..17957)
/rpt_family="(CGG)n"
mRNA join(<19796..21016,21710..21802,22214..22341,
22644..>22743)
/gene="OSJNBa0027P10.13"
gene 19796..22743
/gene="OSJNBa0027P10.13"
/note="predicted by fgenesh"
CDS join(19796..21016,21710..21802,22214..22341,22644..22743)
/gene="OSJNBa0027P10.13"
/codon_start=1
/product="hypothetical protein"
/protein_id="AAG60187.1"
/db_xref="GI:12597879"
/translation="MSDEFDNSIDPAEIYTTDMFMAEHSVLNSFAGRIDRRIKARLEG
GTSRRTSGPRKYINRNHEGAHDQLFADYFAEDPLYSAATFRRRFRMRRHVFLHIVDEL
GKWSSYFTHRVDCTGCLGHSPLQKCTAAIRMLAYGTAADTLDEYLKVPQSTALECLEN
FVEGVVEVFSSRYLRRPTAEDLERLLQVGESRGFPGMLGSIDCMHWRWKNCPTAWKGQ
YTRGDQKYPTIILEAVASYDLHIWHAFFGIPGSNNDINVLNQSPLFIEAIKGEAPQIQ
FIVNGTQYNTGYYLADGIYPEWAAFVKSIRSPQLEKHKLFAREQEGKRKDIERAFGVL
QARFNIVHRPARSWSQKVLRKIMQACVILHNMIVEDEGEMAEDPIDLNAAPGTSIVLP
PEVHAGSNDHPSFSDMCTIYELAVVSTFEGGLTKISAVQNDSVVRMAGECRRDRRCRV
KNPMPKKKTMLGLMAPQTVDEGTDQGDVNGSSIFLAGGEGSHPSPKYSTKHQFSLYTL
QIPET"
repeat_region complement(21831..21851)
/rpt_family="AT_rich"
repeat_region complement(24477..24506)
/rpt_family="AT_rich"
mRNA complement(join(<25322..25620,26566..26629,27116..27190,
27868..27962,28341..28428,28538..28665,29324..29455,
29557..29637,29724..29882,29980..>31398))
/gene="OSJNBa0027P10.11"
gene complement(25322..31398)
/gene="OSJNBa0027P10.11"
/note="similar to arm repeat protein GB:AAF33245
GI:6959880 (Drosophila melanogaster); EST D22130 from this
gene"
CDS complement(join(25606..25620,26566..26629,27116..27190,
27868..27962,28341..28428,28538..28665,29324..29455,
29557..29637,29724..29882,29980..31398))
/gene="OSJNBa0027P10.11"
/codon_start=1
/product="putative arm repeat protein"
/protein_id="AAG60190.1"
/db_xref="GI:12597882"
/translation="MTRRVRRRLCKDGGKGKDVAADEERELVSCSSSSRRRGGLGVAV
AARGGGGGGSGSCVVDWRTLPDDTVLQLFGRLNYRDRASMAAACRTWRDLGASPCLWS
ALDLRAHRCDAEVASSLSSRCGSLRRLRLRGHEAAAAASGLRARGLREVVADGCRGLT
DATLAVLAARHEALESLQIGPDPLERISSDALRQVAFCCSRLRRLRLSGLRDADADAI
GALARYCPLLEDVAFLDCGSVDEAAIAGILSLRFLSVAGCHNLKWATASTSWAQLPSL
VAVDVSRTDVSPSAISRLISHSKTLKLICTLNCKSVEEEQAHNPGAFSNSKGKLVLTI

Query Match 7.9%; Score 33.6; DB 8; Length 141307; 

Best Local Similarity 61.4%; Pred. No. 21; 

Matches 54; Conservative 0; Mismatches 34; Indels 0; Gaps 0; 

Qy 165 gatggaaggggtggatacaactctgttcccatcaacggcggtggcagcccctagctaggc 224

I I I 1 I I III I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 99091 GGTGAAGGTGACGGAGCCAACACCCTTGATCTCCACGCCGGAGGCGTCCCCGAACTTGAC 99150 

Qy 225 ggtggatccgagcctgtatcagaaatcg 252 

I I I I I I I I I I I I I I I I I 1 
Db 99151 GGTGCCTCGGACGCTGGAGTCGAACTCG 99178



RESULT 13
AP002836
LOCUS AP002836 146921 bp DNA PLN 26-JAN-2001
DEFINITION Oryza sativa genomic DNA, chromosome 1, PAC clone:P0512G09.
ACCESSION AP002836
VERSION AP002836.1 GI:9711819
KEYWORDS
SOURCE Oryza sativa (cultivar:Nipponbare) DNA, clone:P0512G09.
ORGANISM Oryza sativa
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
Ehrhartoideae; Oryzeae; Oryza.
REFERENCE 1 (bases 1 to 146921)
AUTHORS Sasaki,T., Matsumoto,T. and Yamamoto,K.
TITLE Oryza sativa nipponbare (GA3) genomic DNA, chromosome 1, PAC
clone:P0512G09
JOURNAL Published Only in DataBase (2000) In press
REFERENCE 2 (bases 1 to 146921)
AUTHORS Sasaki,T., Matsumoto,T. and Yamamoto,K.
TITLE Direct Submission
JOURNAL Submitted (03-AUG-2000) Takuji Sasaki, National Institute of
Agrobiological Resources, Rice Genome Research Program; Kannondai
2-1-2, Tsukuba, Ibaraki 305-8602, Japan
(E-mail:tsasaki@abr.affrc.go.jp, URL:http://rgp.dna.affrc.go.jp/,
Tel:81-298-38-7441, Fax:81-298-38-7468)
COMMENT Genes were predicted from the integrated results of the following:
GENSCAN1.0, BLASTN2.0, BLASTX2.0 as well as SplicePredictor
(October 1998 version). The genomic sequence was searched against
NCBI NonReduntant Protein database, nr
(ftp://ncbi.nlm.nih.gov/blast/db) and the cDNA sequence database at
RGP. Protein homologies of the coding regions were searched against
NCBI NonReduntant Protein database with BLASTP2.0. ESTs represent
the identified cDNA sequences using BLASTN 2.0 with the
corresponding DDBJ accession no. and RGP clone ID.
A gene with identity or significant homology to a protein is
classified based on the protein name to indicate the homology level
such as same name, 'putative-' and '-like protein'. A gene without
significant homology to any protein but with EST homology (covering
almost the entire length of partial sequence) is classified as an
'unknown' protein. A gene predicted with a gene prediction program
is classified as a 'hypothetical' protein.
The orientation of the sequence is from T7 to SP6 of the PAC clone.
This sequence of P0512G09 clone has an overlap with P0695A04 (DDBJ:
AP002816) clone at the 5' end and an overlap with P0710E05 (DDBJ:
AP002743) at the 3' end. The sequence of this clone starts at the
position 46972 of P0695A04 and ends at the position 31325 of
P0710E05. Detailed information on overlap and assembly quality
together with annotation of this entry is available at
http://rgp.dna.affrc.go.jp/GenomeSeq.html.
FEATURES Location/Qualifiers
source 1..146921
/organism="Oryza sativa"
/cultivar="Nipponbare"
/db_xref="taxon:4530"
/chromosome="1"
/clone="P0512G09"
gene join(5036..5129,5308..5485,5682..5763,5786..7147)
/gene="P0512G09.1"
CDS join(5036..5129,5308..5485,5682..5763,5786..7147)
/gene="P0512G09.1"
/note="contains ESTs
AU070282(S21326), C26605(C12676), C72475(E1678),
D15872(C1448), AU091541(C12676)
unknown protein"
/codon_start=1
/protein_id="BAB07919.1"
/db_xref="GI:9711820"
/translation="MPRYVAADCGRSNVAQTLVNPCAITLPSKIKAPPVTEESPYPLS
LPCGPMTRGHHQTTTIPTNQTETRRRRRQRQRRGSILLASRSIRDVTPSDQPSSIIAK
PTSEISSSIVVVVAVEMSSLTAAQFRSLFEMAGGEDISFLFGDDEAAAVAAPVELQGK
RGWEEVDQGEGSGAVAAKRQRSPTSSRENSSGSNEGGQEEVSEAAAAMAAAVGRGGGR
RLWVKERDSEWWDMVSSPDYPDSEFRKAFRMSKATFEVVCDELAAAVAKEDTMLRAAI
PVRKRVAVCVWRLATGEPLRLVSKRFGLGISTCHKLVLEVCAALKAMVMPKVVRWPEA
GDAAAIAAHFEAISGISGWGAIYTTHIPIIAPKSNVASYYNRRHTERNQKTSYSMTV
QCVVDSTGAFTDVCIGWPGSNSDEEVLEKSALYLHRGVPGLIQGQWVVGGGSFPLMDW
MLVPYTHQNLTWAQHMLNEKVAAVRGVARDAFERLKRRWGCLQKRTEVKLLDLPTVLG
ACCVLHNICERSGDAVVDADDCAFDLFDDDMVAENAVRSSAAAQARDAIAHNLLHSGG
GASFF"

gene join(9440..9958,10386..10709)
/gene="P0512G09.2"
CDS join(9440..9958,10386..10709)
/gene="P0512G09.2"
/note="contains ESTs AU067976(C11383), AU067977(C11383)
unknown protein"
/codon_start=1
/protein_id="BAB07920.1"
/db_xref="GI:9711821"
/translation="MADDGDGDCVNLEPFFYDEAATVAEAAAAAERREREEQEKAREA
AANARRWAAHNAALAGIREYDPAEETYIYTRYHYADLSEFDLDEESRLPPMRHTAATY
APPARALHFLCDMINVLAVRIILPSSDRSDGGGVGFPISVYGSVIARDQLDYKCVHLF
RRCRDDPQLITSEDELSLILTGPHRGLVLYDALYIEVDLKMKVKGDQQQGCKDKRLSK
GLIVLDGVLLSTNLSDHLR7VAVKTATLDRRSTMPCAVQVTYAYVTRAVEATVSVELLH
DQGG"
gene complement(join(14232..14299,15647..15798,18978..18994))
/gene="P0512G09.3"
CDS complement(join(14232..14299,15647..15798,18978..18994))
/gene="P0512G09.3"
/note="hypothetical protein"
/codon_start=1
/protein_id="BAB07921.1"
/db_xref="GI:9711822"
/translation="MSGIYSACGVASVAAGSGRGRATVPLTSFNAWRRVGHYRSPQTA
PEGVEGGRERREVRIRRGEGESESDSKADTRGTV"

gene join(19202..19243,19551..19874)
/gene="P0512G09.4"
CDS join(19202..19243,19551..19874)
/gene="P0512G09.4"
/note="hypothetical protein"
/codon_start=1
/protein_id="BAB07922.1"
/db_xref="GI:9711823"
/translation="MGRRWRLGFSPPAQTKHVRVFSSGTDHQLLRSLVNPGEEVILTV
PNEQLEHVVRFLLVTRITRIMTSDDVLVRSLGNAYFLVPTMVNLHPTLAAARLDGRVK
VSSACGAAKACSVGRLSRK"

gene complement(join(20820..20996,21294..21316,22118..22199,
22461..23639))
/gene="P0512G09.5"
CDS complement(join(20820..20996,21294..21316,22118..22199,
22461..23639))
/gene="P0512G09.5"
/note="hypothetical protein
similar to Oryza sativa chromosome 6, P0675A05.27"
/codon_start=1
/protein_id="BAB07923.1"
/db_xref="GI:9711824"
/translation="MKAILCVFEHLSGLKINFHKSEIFCFGGAKEMEDQYRQLFGCNS
GSFPFRYLGIPIHYRKLRNSDWKCVEDMFQRKLSTWKGKNNSYGGRLVLLNSVLSSLP
MFMLLFFEIPRGVLQRLDYYRSSFFWSSDSQKKKYRLTKWDYIYRLKDQGGLGVLNLD
IMNRCLLSKWLFKLLNGEGLWQNLLRNKYLKGKPLSHMSHKPGTSQFWAGLMKVKDQF
FQYGSFKPGNGMEIRFWEDTWLGLQPLKYQYPSLYNVVRKRHSTLVEVMSTTPLNVSF
RRSIVGPKLVEWNDLISRLANITLSNEKDCFIWSLYKNGHFSVKSiyiYNAIINSNVIlH
KRILWKVKVPLKIKVFMWFLHKKVILTKDNLIKRKWRGNKQCCFCNTQETIQHFFFDC
HEQQDTMSNGATRLKLVAKELLLQFGWRVAVSEQLAKEQPSPLPYIVDYTGETLLESR
KDELLLKGAMEINYLGDVVGVVCRRVFVDFLSHWKV"

gene join(25126..25245,25311..25409,25606..25740)
/gene="P0512G09.6"
CDS join(25126..25245,25311..25409,25606..25740)
/gene="P0512G09.6"
/note="contains ESTs C26486(C12424), AU091537(C12424)
unknown protein"
/codon_start=1
/protein_id="BAB07924.1"
/db_xref="GI:9711825"
/translation="MISSSEEATTSYLASCLTLLGLCFCVLVGVVRRKRGVRGELCVL
TSLSVSHLISLSLFLPHVLYCSLISIFFEISMGFGGLTEESEHTRCFLSMQEPCWELC
VLTLYLSAVHHAQYD"
/gene="P0512G09.7"
join(32
/gene="P0512G09.7"
/note="contains EST C72119(E1022)
unknown protein"
/codon_start=1
/protein_id="BAB07925.1"
/db_xref="GI:9711826"
/translation="MAPHLHLRLLLVVGVALGSAGLGWGGGGGEGEATDREAPDPYSI
LTWHDYSPPSPPPPPPPPVAPAATCEGDLHGKGNFSTRCEVSEEVELGGDVYITGEGS
LVLLAGAALTCQRPGCVISANLSGEVRLGRGVRVIAGRVSLAAANVTIADTVVVNTTA
LAGDPPERTSGVPTGTHGDGGGHGGRGASCYVKDGQTQEDSWGGDAYAWSDLEHPFSY
GSKGGSTSVEKDYGGSGGGIVWLYADDLIMNGTVLADGGDSSEKGGGGSGGSIYIKSK
TMHGAGKISASGGNGLAGGGGGRVSINVFSRHDDTQVFAHGGKSSGCPDNAGAAGTLY
EAVPKSLVVSNNNLSTQTDTLLLEFPNQPLWTNVFVKNHAKVAVPLLWSRVQVQGQLS
LLSGAILTFGLTRYPYSEFELMAEELLMSDSTIKVFGALRMSVKMLLMWNSKMLIDGG
GDSIVATSLLDASNLIVLKESSVIHSNANLGVRGQGLLNLSGEGDIIEAQRLILSLFY
SIKVGPGSILRGPLVNGSSGDVAPKLNCDDDICPVEIIHPPEDCNLNSSLSFTLQVCR
VEDIDIWGLVQGTVIHFNRARSVSVHTSGTISATGLGCRSGVGQGKILNSGVSGGGGH
GGRGGDGFYNESHAEGGSMYGSADLPCELGSGSGNDTTKLSTAGGGIIVMGSWEYSLP
SLSLYGSVESNGQSSTDVVTNASIGGPGGGSGGTILLFVRALSLAESSILSSVGGLGN
FGSGGGGGGRIHFHWSNIPTGDEYVPVAAVKGSIRTSGGISKGKGFPGENGTVTGKAC
PKGLYGTFCKECPLGTYKNVTGSSKSLCVQCPPDELPHRAIYTSVRGGAYETPCPYKC
VSDRYRMPHCYTALEELIYTFGGPWLFGLLLSGLLVLLALVLSVARMKFVGTDELPGP
APTQQGSQIDHSFPFLESLNEVLETNRAEESHGHVHRMYFMGPNTFSEPWHLPHTPPE
QISEIVYEDAFNRFVDEINTLAAYQWWEGSIHSILCVLAYPLAWSWQQFRRRKKLQRL
REFVRSEYDHSCLRSCRSRALYEGLKVTATPDLMLGYLDFFLGGDEKRPDLPPRLRQR
FPMCLIFGGDGSYMAPFSLHSDSVLTSLMSQAVPSSIWHRLVAGLNAQLRLVRRGSLR
GTFLPVLDWLETHANPSLGVNGVRVDLAWFQATALGYCQLGLVVYAVEEPMSAELDGS
PRIKIEQHSLITGGILDSNSLRTLKDRRDLFYPFSLILHNTKPVGHQDLVGLVISILL
LADFSLVLLTFLQLYSYSMADVLLVLFVLPLGILSPFPAGINALFSHGPRRSAGLARV
YALWNITSLVNVVVAFACGLVHYKSSTKRHPSTQPWNLGTDESGWWLFPTGLMLLKCI
QARLVDWHVANLEIQDRAVYSNDPSIFWQS"



gene join(44868..44915,45001..45052,47765..47937,48241..48336,
48920..49009,49126..49374,49478..49657,51850..51950,
52039..52204)
/gene="P0512G09.8"
CDS join(44868..44915,45001..45052,47765..47937,48241..48336,
48920..49009,49126..49374,49478..49657,51850..51950,
52039..52204)
/gene="P0512G09.8"
/note="hypothetical protein
similar to Arabidopsis thaliana, M7J2.190"
/codon_start=1
/protein_id="BAB07926.1"

Query Match 7.9%; Score 33.6; DB 8; Length 146921;

Best Local Similarity 61.4%; Pred. No. 21; 

Matches 54; Conservative 0; Mismatches 34; Indels 0; Gaps 0; 

Qy 165 gatggaaggggtggatacaactctgttcccatcaacggcggtggcagcccctagctaggc 224

I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 138989 GGTGAAGGTGACGGAGCCAACACCCTTGATCTCCACGCCGGAGGCGTCCCCGAACTTGAC 139048 

Qy 225 ggtggatccgagcctgtatcagaaatcg 252 

MM I! II III I I I I I I I 
Db 139049 GGTGCCTCGGACGCTGGAGTCGAACTCG 139076



RESULT 14
AC024105/C
LOCUS AC024105 162700 bp DNA HTG 06-SEP-2000
DEFINITION Homo sapiens chromosome 12 clone RP11-571I17, WORKING DRAFT
SEQUENCE, 14 unordered pieces.
ACCESSION AC024105
VERSION AC024105.13 GI:9966537
KEYWORDS HTG; HTGS_PHASE1; HTGS_DRAFT.
SOURCE human.
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 162700)
AUTHORS Muzny,D.M., Adams,C., Bailey,M., Barbaria,J., Blankenburg,K.,
Bodota,B., Bouck,J., Bowie,S., Brooks,A., Buhay,C., Bunac,C.,
Burkett,C., Burrows,J., Carter,M., Chacko,J., Chen,Z., Cox,C.,
David,R., Delgado,O., Deshazo,D., Ding,Y., Domah-Rashid,N.,
Dugan-Rocha,S., Durbin,K.J., Fernandez,C., Ferraguto,D.,
Forcum-Tansey,J., Frantz,P., Ganesh,R., Gorrell,J.H., Gorrell,L.L.,
Guevara,W., Harris,K., Hernandez,J., Hodgson,A., Hogues,M.,
Holloway,C., Hosak,H., Jackson,L.E., Jackson,L., Jia,Y., Jones,M.,
Kelly,S., Kondejewski,N., Kong,Y., Kovar,C., Leal,B., Li,Z.,
Lichtarge,O., Liu,J., Liu,W., Logan,O., Lozado,R.J., Lu,J.,
Lucier,R., Martin,R., Martinez,C., McLeod,M.P., Mei,G., Morgan,M.,
Morris,S., Nash,S., Nelson,A., Nguyen,R., Nguyen,N., Nguyen,S.,
Oswal,G., Parish,B., Paxton,S., Payton,B., Perez,L., Pu,L.L.,
Quiles,M., Reiter,D., Rives,M., Samuel,S., Say,J., Scherer,S.,
Shah,E., Shen,H., Simon,M., Sparks,A., Stamps,A., Sucgang,R.,
Tabor,P., Taylor,T., Vasquez,L., Vinson,R., Vo,Q., Wahbah,M.,
Watlington,S., Weinstock,G., Weinstock,I.R., Williamson,A.,
Worley,K., Wren,J., Wrensford,G., Yu,W., Zhou,X., Nelson,D. and
Gibbs,R.
TITLE Direct Submission
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 162700)
AUTHORS Worley,K.C.
TITLE Direct Submission
JOURNAL Submitted (24-FEB-2000) Human Genome Sequencing Center, Department
of Molecular and Human Genetics, Baylor College of Medicine, One
Baylor Plaza, Houston, TX 77030, USA
COMMENT On Sep 5, 2000 this sequence version replaced gi:9438319.
Genome Center
Center: Baylor College of Medicine
Center code: BCM
Web site: http://www.hgsc.bcm.tmc.edu/
Contact: hgsc-help@bcm.tmc.edu
Project Information
Center project name: HAHT
Center clone name: RP11-571I17
Summary Statistics
Sequencing vector: M13; L08821
Chemistry: Dye-terminator Big Dye: 100% of reads
Assembly program: Phrap; version 0.990329
Consensus quality: 144080 bases at least Q40
Consensus quality: 155185 bases at least Q30
Consensus quality: 158367 bases at least Q20
Estimated insert size: 159112; sum-of-contigs estimation
Quality coverage: 0x in Q20 bases; agarose-fp estimation
Quality coverage: 4x in Q20 bases; sum-of-contigs estimation
NOTE: Estimated insert size may differ from sequence length
(see http://www.hgsc.bcm.tmc.edu/docs/Genbank_draft_data.html)
NOTE: This is a 'working draft' sequence. It currently
consists of 14 contigs. The true order of the pieces
is not known and their order in this sequence record is
arbitrary. Gaps between the contigs are represented as
runs of N, but the exact sizes of the gaps are unknown.
This record will be updated with the finished sequence
as soon as it is available and the accession number will
be preserved.
* 1 35277: contig of 35277 bp in length
* 35278 35377: gap of unknown length
* 35378 50496: contig of 15119 bp in length
* 50497 50596: gap of unknown length
* 50597 69110: contig of 18514 bp in length
* 69111 69210: gap of unknown length
* 69211 82030: contig of 12820 bp in length
* 82031 82130: gap of unknown length
* 82131 94011: contig of 11881 bp in length
* 94012 94111: gap of unknown length
* 94112 104185: contig of 10074 bp in length
* 104186 104285: gap of unknown length
* 104286 115314: contig of 11029 bp in length
* 115315 115414: gap of unknown length
* 115415 124235: contig of 8821 bp in length
* 124236 124335: gap of unknown length
* 124336 133848: contig of 9513 bp in length
* 133849 133948: gap of unknown length
* 133949 143516: contig of 9568 bp in length
* 143517 143616: gap of unknown length
* 143617 151726: contig of 8110 bp in length
* 151727 151826: gap of unknown length
* 151827 157897: contig of 6071 bp in length
* 157898 157997: gap of unknown length
* 157998 161083: contig of 3086 bp in length
* 161084 161183: gap of unknown length
* 161184 162700: contig of 1517 bp in length.
FEATURES Location/Qualifiers
source 1..162700
/organism="Homo sapiens"
/db_xref="taxon:9606"
/chromosome="12"
/clone="RP11-571I17"
BASE COUNT 46705 a 33082 c 32731 g 48849 t 1333 others
ORIGIN



Query Match 7.9%; Score 33.6; DB 2; Length 162700; 

Best Local Similarity 52.1%; Pred. No. 21; 

Matches 75; Conservative 0; Mismatches 69; Indels 0; Gaps 0;

Qy 53 ctcccctgtgctctacttctgcctgatggcccttgtcgtagctgctatggtctgtgtcat 112 

I I I I I I II Mil I III III III II I I I I I I I I I I I 
Db 43901 CTCCTCGGGTCACCGCTTCACTCCCCTGGGACCTGATGTAACTATCATGATCAATGTCAT 43842 

Qy 113 gtacaccacctcggcacaagcaggaaggagtggctacaactcgtacgaacctgatggaag 172 

I I I I I I II I III I I I I I I I II II I I I I 

Db 43841 GTACAGCCTTCCAGATCTTCTCCTGAGGTTTTGCTAACATATTTACCCACGGATAGGAAT 43782

Qy 173 gggtggatacaactctgttcccat 196 

II I III II III II 
Db 43781 TGGAGCATATAATGTTTTTGTCTT 43758 



RESULT 15
AC074345/C
LOCUS AC074345 178141 bp DNA HTG 29-AUG-2000
DEFINITION Homo sapiens chromosome 12 clone RP11-349K16, WORKING DRAFT
SEQUENCE, 23 unordered pieces.
ACCESSION AC074345
VERSION AC074345.3 GI:9937816
KEYWORDS HTG; HTGS_PHASE1; HTGS_DRAFT.
SOURCE human.
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 178141)
AUTHORS Waterston,R.H.
TITLE The sequence of Homo sapiens clone
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 178141)
AUTHORS Waterston,R.H.
TITLE Direct Submission
JOURNAL Submitted (28-JUL-2000) Genome Sequencing Center, Washington
University School of Medicine, 4444 Forest Park Parkway, St. Louis,
MO 63108, USA
COMMENT On Aug 29, 2000 this sequence version replaced gi:9665208.
Genome Center
Center: Washington University Genome Sequencing Center
Center code: WUGSC
Web site: http://genome.wustl.edu/gsc/index.shtml
Project Information
Center project name: H_NH0349K16
Summary Statistics
Sequencing vector: M13; 100%
Sequencing vector: plasmid; 0%
Chemistry: Dye-primer ET; 100% of reads
Chemistry: Dye-terminator Big Dye; 0% of reads
Assembly program: Phrap; version 0.990319
Consensus quality: 165955 bases at least Q40
Consensus quality: 169539 bases at least Q30
Consensus quality: 171386 bases at least Q20
Insert size: 173000; agarose-fp
Insert size: 175941; sum-of-contigs
Quality coverage: 4.39 in Q20 bases; agarose-fp
Quality coverage: 4.46 in Q20 bases; sum-of-contigs
* NOTE: This is a 'working draft' sequence. It currently
* consists of 23 contigs. The true order of the pieces
* is not known and their order in this sequence record is
* arbitrary. Gaps between the contigs are represented as
* runs of N, but the exact sizes of the gaps are unknown.
* This record will be updated with the finished sequence
* as soon as it is available and the accession number will
* be preserved.
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contig of 1055 
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contig of 1367 
gap of unknown 
contig of 1397 
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contig of 1775 
gap of unknown 
contig of 1754 
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contig of 1442 
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contig of 2386 
gap of unknown 
contig of 2444 
gap of unknown 
contig of 4003 
gap of unknown 
contig of 4614 
gap of unknown 
contig of 3333 



178141
FEATURES Location/Qualifiers
source 1..178141
/organism="Homo sapiens"
/db_xref="taxon:9606"
/chromosome="12"
/clone="RP11-349K16"



5 bp in 
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0 bp in 
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9 bp in 
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8 bp in 
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3 bp in 
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bp in 
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bp in 
length 
bp in 
length 
bp in 
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bp in 
length 
bp in 
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length 
length 
length 
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length . 
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a 35937 c 36052 


: g 
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ORIGIN 



Query Match 7.9%; Score 33.6; DB 2; Length 178141;

Best Local Similarity 52.1%; Pred. No. 21; 

Matches 75; Conservative 0; Mismatches 69; Indels 0; Gaps 0; 

Qy 53 ctcccctgtgctctacttctgcctgatggcccttgtcgtagctgctatggtctgtgtcat 112 

I I I I I I II I I I I I III I I I I I I I I I I I I I I I I J I I 
Db 11515 CTCCTCGGGTCACCGCTTCACTCCCCTGGGACCTGATGTAACTATCATGATCAATGTCAT 11456 

Qy 113 gtacaccacctcggcacaagcaggaaggagtggctacaactcgtacgaacctgatggaag 172 

I I I I I I II I III I I I I I I III II I I M 

Db 11455 GTACAGCCTTCCAGATCTTCTCCTGAGGTTTTGCTAACATATTTACCCACGGATAGGAAT 11396

Qy 173 gggtggatacaactctgttcccat 196 

I I I I I I I I III II 
Db 11395 TGGAGCATATAATGTTTTTGTCTT 11372 



Search completed: February 7, 2002, 11:03:09 
Job time: 9715 sec 

GenCore version 4.5 
Copyright (c) 1993 - 2000 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: February 7, 2002, 10:59:55 ; Search time 428.31 Seconds 

(without alignments) 
850.700 Million cell updates/sec 



Title: US-09-394-745-6332
Perfect score: 425
Sequence: 1 cggacgcgtgggtgcaattt ... tgtggtgcctctctcaacct 425



Scoring table: IDENTITY_NUC 

Gapop 10.0 , Gapext 1.0 

Searched: 930621 seqs, 428662619 residues 

Total number of hits satisfying chosen parameters: 1861242 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database: N_Geneseq_1101:*
1: /SIDS2/gcgdata/geneseq/geneseqn/NA1980.DAT:*
2: /SIDS2/gcgdata/geneseq/geneseqn/NA1981.DAT:*
3: /SIDS2/gcgdata/geneseq/geneseqn/NA1982.DAT:*
4: /SIDS2/gcgdata/geneseq/geneseqn/NA1983.DAT:*
5: /SIDS2/gcgdata/geneseq/geneseqn/NA1984.DAT:*
6: /SIDS2/gcgdata/geneseq/geneseqn/NA1985.DAT:*
7: /SIDS2/gcgdata/geneseq/geneseqn/NA1986.DAT:*
8: /SIDS2/gcgdata/geneseq/geneseqn/NA1987.DAT:*
9: /SIDS2/gcgdata/geneseq/geneseqn/NA1988.DAT:*
10: /SIDS2/gcgdata/geneseq/geneseqn/NA1989.DAT:*
11: /SIDS2/gcgdata/geneseq/geneseqn/NA1990.DAT:*
12: /SIDS2/gcgdata/geneseq/geneseqn/NA1991.DAT:*
13: /SIDS2/gcgdata/geneseq/geneseqn/NA1992.DAT:*
14: /SIDS2/gcgdata/geneseq/geneseqn/NA1993.DAT:*
15: /SIDS2/gcgdata/geneseq/geneseqn/NA1994.DAT:*
16: /SIDS2/gcgdata/geneseq/geneseqn/NA1995.DAT:*
17: /SIDS2/gcgdata/geneseq/geneseqn/NA1996.DAT:*
18: /SIDS2/gcgdata/geneseq/geneseqn/NA1997.DAT:*
19: /SIDS2/gcgdata/geneseq/geneseqn/NA1998.DAT:*
20: /SIDS2/gcgdata/geneseq/geneseqn/NA1999.DAT:*
21: /SIDS2/gcgdata/geneseq/geneseqn/NA2000.DAT:*
22: /SIDS2/gcgdata/geneseq/geneseqn/NA2001.DAT:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
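
The per-alignment statistics in this report can be cross-checked against each other with simple arithmetic. The sketch below is an inference from the reported figures, not a documented GenCore formula: it assumes Query Match is the score as a percentage of the perfect score (425 for this query), Best Local Similarity is matches as a percentage of aligned positions, and an ungapped IDENTITY_NUC score behaves like matches minus 0.6 per mismatch. The function names are hypothetical.

```python
def query_match_pct(score, perfect_score):
    """Score as a percentage of the perfect (self-match) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches, mismatches, indels=0):
    """Matches as a percentage of all aligned positions."""
    return round(100.0 * matches / (matches + mismatches + indels), 1)


def ungapped_identity_score(matches, mismatches, mismatch_penalty=0.6):
    """Assumed ungapped score: +1 per match, -0.6 per mismatch (inferred)."""
    return round(matches - mismatch_penalty * mismatches, 1)


if __name__ == "__main__":
    # Figures taken from an alignment with 74 matches and 69 mismatches.
    print(query_match_pct(32.6, 425))          # compare with "Query Match"
    print(best_local_similarity_pct(74, 69))   # compare with "Best Local Similarity"
    print(ungapped_identity_score(74, 69))     # compare with "Score"
```

These relations hold for the ungapped alignments listed in this report, which makes them a convenient check when repairing digits damaged in extraction.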
ALIGNMENTS



RESULT 1 
AAC69485 

ID AAC69485 standard; cDNA; 1559 BP. 
XX 

AC AAC69485; 
XX 

DT 30-JAN-2001 (first entry) 
XX 

DE Human secreted protein gene 31 SEQ ID NO: 41. 
XX 

KW Human; secreted protein; diagnosis; immunosuppressive; antiarthritic;

KW antirheumatic; antiproliferative; cytostatic; cardiant; vasotropic; 

KW cerebroprotective; nootropic; neuroprotective; antibacterial; virucide; 

KW fungicide; ophthalmological; gene therapy; autoimmune disease; infection;

KW hyperproliferative disorder; cardiovascular disorder; angiogenesis;

KW cerebrovascular disorder; nervous system disorder; ocular disorder; 

KW wound healing; skin aging; food additive; preservative; ss. 



OS Homo sapiens. 
XX 

PN WO200058469-A1. 
XX 

PD 05-OCT-2000. 
XX 

PF 23-MAR-2000; 2000WO-US07579.
XX

PR 26-MAR-1999; 99US-0126509.

PR 07-JAN-2000; 2000US-0174853.
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Rosen CA, Ruben SM, Komatsoulis G; 
XX 

DR WPI; 2000-594642/56. 

DR P-PSDB; AAB38233. 
XX 

PT Isolated nucleic acid molecule encoding a human secreted protein is 

PT used in preventing, treating or ameliorating a medical condition 
XX 

PS Claim 1; Page 348; 416pp; English. 
XX 

CC The polynucleotide sequences given in AAC69455 to AAC69502 encode the 

CC human secreted proteins given in AAB38203 to AAB38250. AAB38251 to 

CC AAB38320 represent human secreted polypeptide sequences and proteins 

CC homologous to them, which are given in the exemplification of the present 

CC invention. Human secreted proteins have activities based on the tissues 

CC and cells the genes are expressed in. Example of activities include: 

CC immunosuppressive; antiarthritic; antirheumatic; antiproliferative; 

CC cytostatic; cardiant; vasotropic; cerebroprotective; nootropic;

CC neuroprotective; antibacterial; virucide; fungicide; and

CC ophthalmological. The polynucleotides and polypeptides can be used to

CC prevent, treat or ameliorate a medical condition in e.g. humans, mice, 

CC rabbits, goats, horses, cats, dogs, chickens or sheep. They are also used 

CC in diagnosing a pathological condition or susceptibility to a 

CC pathological condition. Disorders which are diagnosed or treated include 

CC autoimmune diseases, hyperproliferative disorders, cardiovascular

CC disorders, cerebrovascular disorders, angiogenesis, nervous system

CC disorders, infections caused by bacteria, viruses and fungi and ocular 

CC disorders. The polypeptides can also be used to aid wound healing and 

CC epithelial cell proliferation, to prevent skin aging due to sunburn, to 

CC maintain organs before transplantation, for supporting cell culture of 

CC primary tissues, to regenerate tissues and in chemotaxis. The 

CC polypeptides can also be used as a food additive or preservative to 

CC increase or decrease storage capabilities. AAC69446 to AAC69454 and 

CC AAB38202 represent sequences used in the exemplification of the present

CC invention. 

XX 

SQ Sequence 1559 BP; 236 A; 525 C; 389 G; 405 T; 4 other;



Query Match 7.7%; Score 32.6; DB 21; Length 1559;

Best Local Similarity 51.7%; Pred. No. 2;

Matches 74; Conservative 0; Mismatches 69; Indels 0; Gaps 0;



Qy 283 cactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaacgg 342 

I I I I I I I I I I I I II I I 1 I 1 II I I III II 

Db 1340 cactccagctgctgcttcaggacccagatgtcgtggctgctcacgctctcccaggcgctg 1399 

Qy 343 ctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccctcg 402

I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I 

Db 1400 ctctcgctcagggtgcgccgccgcctccccaccgaggagccagcgtcgctctcctcctcc 1459

Qy 403 ttctgtggtgcctctctcaacct 425 

MM II M I II I I 

Db 1460 ttctcctcctcccttccccacct 1482 



RESULT 2 
AAH99273 

ID AAH99273 standard; cDNA; 565 BP. 
XX 

AC AAH99273; 
XX 

DT 16-OCT-2001 (first entry)
XX 

DE Human protein encoding cDNA sequence SEQ ID NO: 108. 
XX 

KW Human; cancer; ulcer; HIV infection; human immunodeficiency virus; 

KW antiinflammatory; antirheumatic; antiarthritic; immunosuppressive; 

KW antibacterial; endocrine; cardiant; central nervous system; virucide; 

KW anti-HIV; fungicide; antimutagen; cardiovascular; antianaemic; anaemia; 

KW antiaggregant; haemostatic; vulnerary; antiulcer; osteopathic; eczema;

KW dermatological; antiallergic; antiasthmatic; antidiabetic; cytostatic; 

KW neuroprotective; antidepressant; nootropic; antiparkinsonian; infection; 

KW immunostimulant; gene therapy; antisense therapy; vaccine; inflammation;

KW antianaphylactic; rheumatoid arthritis; septic shock; pancreatitis;

KW cardiac dysfunction; neuropathology; cardiac anaphylaxis; autoimmunity; 

KW genetic disease; haematopoietic disorder; platelet disorder; asthma; 

KW thrombocytopaenia; osteoporosis; severe combined immunodeficiency; 

KW allergic rhinitis; diabetes; multiple sclerosis; depression; 

KW Alzheimer's disease; Parkinson's disease; neurodegenerative disorder; 

KW neurological disorder; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200153455-A2.
XX 

PD 26-JUL-2001. 

XX 

PF 22-DEC-2000; 2000WO-US35017.
XX

PR 23-DEC-1999; 99US-0471275.

PR 21-JAN-2000; 2000US-0488725.

PR 25-APR-2000; 2000US-0552317.
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Tang YT, Liu C, Drmanac RT; 
XX 

DR WPI; 2001-457603/49. 

DR P-PSDB; AAM25332. 



XX 

PT Isolated human polynucleotides encoding polypeptides, useful for the 

PT treatment and diagnosis of e.g. cancer, ulcers and HIV infection - 
XX 

PS Claim 1; Page 347; 1217pp; English. 
XX 

CC AAH99166 to AAH99904 encode the human proteins given in AAM25225 to 

CC AAM25963. The proteins can have activities based on the tissues and 

CC cells they are expressed in, such as: antiinflammatory; antirheumatic; 

CC antiarthritic; immunosuppressive; antibacterial; endocrine; cardiant; 

CC central nervous system; virucide; anti-HIV; fungicide; antimutagen; 

CC cardiovascular; antianaemic; antiaggregant; haemostatic; vulnerary;

CC antiulcer; osteopathic; dermatological; antiallergic; antiasthmatic; 

CC antidiabetic; cytostatic; neuroprotective; antidepressant; nootropic; 

CC antiparkinsonian; and immunostimulant. The proteins and polynucleotides

CC encoding them can be used in gene therapy, antisense therapy and vaccine

CC production. The proteins and polynucleotides are useful for screening for

CC agonists or antagonists of a protein and for the treatment and diagnosis 

CC of disorders associated with the activity of a protein e.g. inflammation, 

CC rheumatoid arthritis, septic shock, pancreatitis, cardiac dysfunction, 

CC neuropathology, cardiac anaphylaxis, viral, bacterial, HIV and fungal 

CC infections, autoimmunity, genetic diseases, haematopoietic disorders, 

CC anaemia, platelet disorders, thrombocytopaenia, wounds, burns, ulcers, 

CC osteoporosis, severe combined immunodeficiency, eczema, allergic 

CC rhinitis, asthma, diabetes, cancer, multiple sclerosis, depression, 

CC Alzheimer's disease, Parkinson's disease, neurodegenerative and 

CC neurological disorders. 
XX 

SQ Sequence 565 BP; 69 A; 221 C; 126 G; 149 T; 0 other; 



Query Match 7.6%; Score 32.4; DB 22; Length 565; 

Best Local Similarity 56.6%; Pred. No. 1.5; 

Matches 60; Conservative 0; Mismatches 46; Indels 0; Gaps 0; 

Qy 320 tgcacacgctagccgattaacggctcacgctaccaggcgctctacgcggatgtgccccct 379 

I I I I II I I I III I I I I I I I I I I I III II 

Db 379 tgctcacgctctcccaggcgctgctctcgctcagggtgcgccgccgcctccccaccgagg 438 

Qy 380 agccagcttctctctccccctcgttctgtggtgcctctctcaacct 425 

I I I I I I I I I I I I M I I I I I I I I I II I I I I I I I 

Db 439 agccagcgtcgctctcctcctctttctcctcctcccttccccacct 484 



RESULT 3 
AAD09494/C 

ID AAD09494 standard; DNA; 4360 BP. 
XX 

AC AAD09494; 
XX 

DT 10-SEP-2001 (first entry) 
XX 

DE Human SGP018 phosphatase polypeptide encoding DNA. 
XX 

KW Human; SGP018 phosphatase polypeptide; phosphatase-related disease; 

KW immune-related disorder; ocular disease; organ transplant rejection; 

KW infection; diabetes; pain; sexual dysfunction; Alzheimer's disease; 



KW metabolic disorder; haematopoietic cancer; mood disorder; cardiant; 

KW Parkinson's disease; multiple sclerosis; amyotrophic lateral sclerosis; 

KW cardiovascular disease; brain; neuronal-associated disease; dyskinesia; 

KW attention disorder; cognition disorder; psychotic disorder; cytostatic; 

KW neurological disorder; virucide; nootropic; cerebroprotective; therapy; 

KW neuroprotective; antibacterial; vulnerary; tranquilliser; antiasthmatic; 

KW hypotensive; immunosuppressive; antipsoriatic; analgesic; hypertensive; 

KW antifungal; dual specificity phosphatase; DSP; MAP kinase phosphatase; 

KW MKP; migraine; ds.
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 208..3609

FT /*tag= a

FT /product= "Human SGP018 phosphatase polypeptide"

XX 

PN WO200146394-A2.
XX 

PD 28-JUN-2001. 
XX 

PF 21-DEC-2000; 2000WO-US34736.
XX

PR 21-DEC-1999; 99US-0173255.

PR 28-DEC-1999; 99US-0175766.

PR 25-JAN-2000; 2000US-0178078.

PR 31-JAN-2000; 2000US-0179301.
XX 

PA (SUGE-) SUGEN INC. 
XX 

PI Plowman GD, Martinez R, Whyte D, Manning G, Sudarsanam S, Hill RJ; 

PI Flanagan P; 

XX 

DR WPI; 2001-418058/44. 

DR P-PSDB; AAE04836. 
XX 

PT Novel phosphatase polypeptide useful for treating cancers, 

PT immune-related diseases and disorders, cardiovascular disease, brain or 

PT neuronal-associated diseases and metabolic disorders 

XX 

PS Claim 29; Fig 1; 186pp; English. 
XX 

CC The present invention relates to phosphatase polypeptides, nucleotide 

CC sequences encoding them, as well as various products and methods useful 

CC for the diagnosis and treatment of various phosphatase-related diseases 

CC and conditions. Substance that modulates the activity of phosphatase 

CC polypeptide is used to treat immune-related diseases and disorders, 

CC cardiovascular disease, brain or neuronal-associated diseases and 

CC metabolic disorders, including cancers of tissues, cancers of 

CC haematopoietic origin, diseases of central and peripheral nervous 

CC system, Alzheimer's disease, Parkinson's disease, multiple sclerosis, 

CC amyotrophic lateral sclerosis, viral infections, infections caused by 

CC prions, bacteria and fungi, ocular diseases, diabetes, migraines, pain, 

CC sexual dysfunction, mood disorders, attention disorders, cognition 

CC disorders, hypotension, hypertension, psychotic disorders, neurological 

CC disorders, dyskinesias and organ transplant rejection. The present 

CC sequence is a DNA encoding human SGP018 phosphatase polypeptide. This 



CC sequence is classified as dual specificity phosphatase (DSP) and MAP 

CC kinase phosphatase (MKP).

XX 

SQ Sequence 4360 BP; 1138 A; 1076 C; 1363 G; 783 T; 0 other; 



Query Match 7.6%; Score 32.4; DB 22; Length 4360;

Best Local Similarity 56.6%; Pred. No. 3.5;

Matches 60; Conservative 0; Mismatches 46; Indels 0; Gaps 0;

Qy 320 tgcacacgctagccgattaacggctcacgctaccaggcgctctacgcggatgtgccccct 379 

M I I I I I I I III I I I II I I I I I I III II 

Db 1699 TGCTCACGCTCTCCCAGGCGCTGCTCTCGCTCAGGGTGCGCCGCCGCCTCCCCACCGAGG 1640

Qy 380 agccagcttctctctccccctcgttctgtggtgcctctctcaacct 425 

I I I I I I I I I I I 1 I I I I I I I I I I I II I I I I I I I 

Db 1639 AGCCAGCGTCGCTCTCCTCCTCCTTCTCCTCCTCCCTTCCCCACCT 1594 



RESULT 4 
AAF12056 

ID AAF12056 standard; cDNA; 545 BP. 
XX 

AC AAF12056; 
XX 

DT 13-MAR-2001 (first entry) 
XX 

DE Aspergillus oryzae EST SEQ ID NO: 4579. 
XX 

KW Multiple gene expression; filamentous fungal cell; EST; 

KW expressed sequence tag; Fusarium venenatum; Aspergillus niger; 

KW Aspergillus oryzae; Trichoderma reesei; identification; recombination; 

KW culture condition; environmental stress; spore morphogenesis; 

KW metabolic pathway engineering; catabolic pathway engineering; ss. 

XX 

OS Aspergillus oryzae. 
XX 

PN WO200056762-A2. 
XX 

PD 28-SEP-2000. 
XX 

PF 22-MAR-2000; 2000WO-US07781.
XX

PR 22-MAR-1999; 99US-0273623.
XX 

PA (NOVO ) NOVO NORDISK BIOTECH INC. 

PA (NOVO ) NOVO NORDISK AS. 

XX 

PI Berka RM, Rey MW, Shuster JR, Kauppinen S, Clausen IG, Olsen PB; 
XX 

DR WPI; 2000-594572/56. 
XX 

PT Monitoring differential expression of genes in filamentous fungal cells 

PT uses fluorescence-labeled nucleic acids isolated from the cells and a 

PT substrate of expressed sequence tags - 
XX 

PS Claim 88; Page 1949; 3161pp; English. 



XX 

CC The present invention describes a method for monitoring differential 

CC expression of genes in a first filamentous fungal (FF) cell relative to 

CC expression of the same genes in one or more second filamentous fungal 

CC cells. The method uses fluorescence-labeled nucleic acids isolated from 

CC the FF cells and a substrate of expressed sequence tags (EST). The ESTs

CC are used in the methods for monitoring differential expression of genes 

CC in a first filamentous fungal (FF) cell relative to expression of the 

CC same genes in one or more second filamentous fungal cells. Monitoring 

CC the global expression of genes from FF cells allows the production 

CC potential of the microorganisms to be improved. New genes may be 

CC discovered, possible functions of unknown open reading frames can be 

CC identified and gene copy number variation and stability can be 

CC monitored. The expression of genes can be used to study how FF cells 

CC adapt to changes in culture conditions, environmental stress, spore 

CC morphogenesis, recombination, metabolic or catabolic pathway 

CC engineering. Using ESTs provides several advantages over genomic or 

CC random cDNA clones including elimination of redundancy as one spot on an 

CC array equals one gene or open reading frame, and organisation of the 

CC microarrays based on function of the gene products to facilitate 

CC analysis of the results. AAF07478 to AAF11247 represents ESTs from 

CC Fusarium venenatum; AAF11248 to AAF11853 represents ESTs from Aspergillus 

CC niger; AAF11854 to AAF14878 represents ESTs from Aspergillus oryzae; and 

CC AAF14879 to AAF15337 represents ESTs from Trichoderma reesei, which are 

CC all specifically claimed in the present invention. 

XX 

SQ Sequence 545 BP; 137 A; 148 C; 131 G; 129 T; 0 other; 



Query Match 7.4%; Score 31.4; DB 21; Length 545; 

Best Local Similarity 57.7%; Pred. No. 3.1; 

Matches 56; Conservative 0; Mismatches 41; Indels 0; Gaps 0; 

Qy 30 gacgagatcatgaggaagcaatactcccctgtgctctacttctgcctgatggcccttgtc 89

II I I I I I I I I I I I I I I I I I I II I I I II I I I I III II 
Db 155 gagctgctcaagcagaagcagtactcccctatgtccgtttccgacatggtccctctcatc 214 

Qy 90 gtagctgctatggtctgtgtcatgtacaccacctcgg 126 

I M I I I I III II I I I I I I I I 
Db 215 ttcgctggtgtcaacggtcacctcgacaacatccccg 251 



RESULT 5 
AAQ20760 

ID AAQ20760 standard; DNA; 2075 BP. 
XX 

AC AAQ20760;
XX 

DT 16-APR-1992 (first entry) 
XX 

DE Rice light-harvesting chlorophyll a/b-combined protein gene. 
XX 

KW promoter; LHCP II; photosynthesis; ss. 
XX 

OS Oryza sativa. 
XX 

FH Key Location/Qualifiers
FT [key illegible] [location illegible]
FT /*tag= a
FT /product= LHCP II
FT promoter [location illegible]
FT /*tag= b
FT polyA_signal 1825..1830
FT /*tag= c
XX

PN [illegible]
XX

PD [illegible]
XX

PF 27-MAR-1990; 90JP-0075774.
XX

PR 27-MAR-1990; 90JP-0075774.
XX

PA (MITK ) MITSUI TOATSU CHEM INC.

PA (NORQ ) NORINSHO KK.
XX

DR WPI; 1992-029693/04.
XX

PT Photosynthesis-related gene, for new plant species - comprises

PT DNA acid fragment contg. promoter of light-collecting

PT chlorophyll-combined protein gene obtd. from rice plant
XX

PS Claim 3; Fig 1; 6pp; Japanese.
XX

CC LHCP II sequences were isolated from a cDNA library prepared from

CC total RNA of 2-week old rice shoots. The library was probed by a

CC 17mer probe based on part of the (known) LHCP II coding sequence.

CC A positive cDNA clone was then used to screen a rice genomic

CC library. Four positive clones were identified and sequenced. The

CC promoter region (tag = b) corresponds to nucleotides -785 to

CC plus 59 using conventional nucleotide numbering, i.e. where

CC transcription start site is plus 1.
XX

SQ Sequence 2075 BP; 481 A; 610 C; 502 G; 482 T; 0 other;



Query Match 7.2%; Score 30.4; DB 13; Length 2075;

Best Local Similarity 47.0%; Pred. No. 11;

Matches 94; Conservative 0; Mismatches 106; Indels 0; Gaps 0;

Qy 32 cgagatcatgaggaagcaatactcccctgtgctctacttctgcctgatggcccttgtcgt 91

Db 1490 [sequence illegible]

Qy 92 [sequence illegible]

Db 1550 [sequence illegible]

Qy 152 [sequence illegible]

Db 1610 [sequence illegible]

Qy 212 [sequence illegible]

Db 1670 accgtagcttagcagtggtt 1689



RESULT 6 
AAF74625

ID AAF74625 standard; cDNA; 3326 BP. 
XX 

AC AAF74625; 
XX 

DT 14-MAY-2001 (first entry) 
XX 

DE Human GLI-1 nucleotide sequence SEQ ID NO: 27. 
XX 

KW SUFUH; GLI-1; Sonic hedgehog-patched signalling pathway; cancer; 

KW cell differentiation; tissue development; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200112655-A1. 
XX 

PD 22-FEB-2001.
XX

PF 14-AUG-2000; 2000WO-SE01576.
XX

PR 13-AUG-1999; 99SE-0002899.
XX

PA (KARO-) KAROLINSKA INNOVATIONS AB.
XX 

PI Toftgard R; 
XX 

DR WPI; 2001-211199/21. 
XX 

PT Novel peptides comprising fragments of two components of sonic 

PT hedgehog-patched signaling pathway, GLI-1 and SUFUH, useful for 

PT treating cancer and diseases influencing cell differentiation and 

PT tissue development 
XX 

PS Example 3; Page 111-112; 115pp; English. 
XX 

CC The present invention describes peptides consisting of fragments of GLI-1 

CC and SUFUH, respectively which are able to specifically bind to SUFUH and

CC GLI-1, respectively. GLI-1 and SUFUH are components which interact in the

CC Sonic hedgehog (Shh)-patched (Ptch) signalling pathway. The present

CC invention also describes: (1) DNA sequences encoding the peptides; and 

CC (2) a monoclonal antibody or an antibody fragment directed against the 

CC peptides. The peptides have cytostatic activity, and can be used as 

CC Shh-Ptch signalling pathway modulators. The peptides and monoclonal 

CC antibodies against them can be used for preparing a pharmaceutical 

CC composition for treating cancer. The peptides on contact with the GLI-1 

CC and SUFUH in vivo affects the Shh-Ptch signalling pathway which is used 

CC in the treatment of cancer. The peptides comprising the peptide fragments 

CC of the signalling pathway are also useful for treating other diseases 

CC influencing cell differentiation and tissue development. The present 

CC sequence represents the human GLI-1 nucleotide sequence, which is used 

CC in an example from the present invention. 
XX 

SQ Sequence 3326 BP; 715 A; 1087 C; 879 G; 645 T; 0 other; 



Query Match 7.1%; Score 30.2; DB 22; Length 3326; 

Best Local Similarity 53.9%; Pred. No. 16; 

Matches 62; Conservative 0; Mismatches 53; Indels 0; Gaps 0;

Qy 72 tgcctgatggcccttgtcgtagctgctatggtctgtgtcatgtacaccacctcggcacaa 131 

II I I I I I I I I I I II I I I I I I I I I I I II I I I I I 
Db 1442 tggacgagggaccttgcattgctggcactggtctgtccactcttcgccgccttgagaacc 1501 

Qy 132 gcaggaaggagtggctacaactcgtacgaacctgatggaaggggtggatacaact 186 

I I I I I I I I I I I I I I I I I I I I I I III I I II I 
Db 1502 tcaggctggaccagctacatcaactccggccaatagggacccggggtctcaaact 1556 



RESULT 7 
AAD12302 

ID AAD12302 standard; cDNA; 3600 BP. 
XX 

AC AAD12302; 
XX 

DT 16-OCT-2001 (first entry) 
XX 

DE Human Cubitus interruptus (Ci) homologue, GLI-1 cDNA. 
XX 

KW Human; transgenic non-human animal; Cubitus interruptus; Ci; GLI-1; 

KW basal cell carcinoma; BCC model system; tumour; screening; 

KW anti-cancer; trichoepithelioma; cylindroma; trichoblastoma; ss.

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 
FT CDS 79..3399

FT /*tag= a

FT /product= "Human Ci homologue, GLI-1"

XX 

PN WO200156376-A1. 
XX 

PD 09-AUG-2001.
XX

PF 02-FEB-2001; 2001WO-SE00204.
XX

PR 03-FEB-2000; 2000SE-0000345.
XX

PA (KARO-) KAROLINSKA INNOVATIONS AB.
XX 

PI Toftgard R; 
XX 

DR WPI; 2001-488828/53. 
DR P-PSDB; AAE06644. 
XX 

PT Transgenic non-human animal useful as basal cell carcinoma model system 

PT to identify anti-cancer drug candidates, overexpresses transgene 

PT encoding GLI-1 protein which is a human homolog to Cubitus interruptus 

PT - 

XX 

PS Claim 6; Page 25-26; 33pp; English. 



XX 

CC The present invention relates to a transgenic non-human animal 

CC comprising a transgene containing a nucleic acid encoding a human 

CC Cubitus interruptus (Ci) homologue protein, GLI-1. The transgenic 

CC non-human animal is useful as basal cell carcinoma (BCC) model system 

CC since it overexpresses GLI-1 which leads to development of tumours 

CC resembling human BCC. Thus it is also useful for screening anti-cancer 

CC drug candidates and evaluating whether it affects BCC, 

CC trichoepitheliomas, cylindromas and trichoblastomas. The present

CC sequence is a cDNA encoding human Ci homologue protein, GLI-1. 

XX 

SQ Sequence 3600 BP; 785 A; 1161 C; 949 G; 705 T; 0 other; 



Query Match 7.1%; Score 30.2; DB 22; Length 3600; 

Best Local Similarity 53.9%; Pred. No. 16; 

Matches 62; Conservative 0; Mismatches 53; Indels 0; Gaps 0; 

Qy 72 tgcctgatggcccttgtcgtagctgctatggtctgtgtcatgtacaccacctcggcacaa 131 

II I I I I I II I I I II I I I I I I I I I I I I I I I I I I 
Db 1520 tggacgagggaccttgcattgctggcactggtctgtccactcttcgccgccttgagaacc 1579

Qy 132 gcaggaaggagtggctacaactcgtacgaacctgatggaaggggtggatacaact 186 

II I I I I I I I I I I I I III I I I I I II I I I I I I 
Db 1580 tcaggctggaccagctacatcaactccggccaatagggacccggggtctcaaact 1634 



RESULT 8 




AAC44700

ID AAC44700 standard; DNA; 5118 BP.
XX

AC AAC44700;
XX

DT 18-OCT-2000 (first entry)
XX

DE Arabidopsis thaliana DNA fragment SEQ ID NO: 43802.
XX

KW Hybridisation assay; genetic mapping; gene expression control;

KW protein identification; signal transduction pathway;

KW metabolic pathway; promoter; termination sequence; ss.
XX

OS Arabidopsis thaliana.
XX

PN EP1033405-A2.
XX

PD 06-SEP-2000.
XX

PF 25-FEB-2000; 2000EP-0301439.
XX

PR 25-FEB-1999; 99US-0121825.
PR 05-MAR-1999; 99US-0123180.
PR 09-MAR-1999; 99US-0123548.
PR 23-MAR-1999; 99US-0125788.
PR 25-MAR-1999; 99US-0126264.
PR 29-MAR-1999; 99US-0126785.
PR [date illegible]; 99US-0128714.
PR [date illegible]; 99US-0129845.
PR [date illegible]; 99US-0130077.
PR [date illegible]; 99US-0130449.
PR [date illegible]; 99US-0130510.
PR [date illegible]; 99US-0131449.
PR [entries illegible in source]
PR 23-JUL-1999; 99US-0145145.
PR [entry illegible in source]
PR 23-JUL-1999; 99US-0145224.
PR [entries illegible in source]
PR 04-AUG-1999; 99US-0147204.
PR 04-AUG-1999; 99US-0147302.
PR 05-AUG-1999; [number illegible].
PR 05-AUG-1999; 99US-0147260.
PR 06-AUG-1999; 99US-0147303.
PR 06-AUG-1999; 99US-0147416.
PR 09-AUG-1999; 99US-0147493.
PR 09-AUG-1999; 99US-0147935.
PR 10-AUG-1999; 99US-0148171.
PR 11-AUG-1999; 99US-0148319.
PR 12-AUG-1999; 99US-0148341.
PR 13-AUG-1999; 99US-0148565.
PR 13-AUG-1999; 99US-0148684.
PR 16-AUG-1999; 99US-0149368.
PR 17-AUG-1999; 99US-0149175.
PR 18-AUG-1999; 99US-0149426.
PR [entries illegible in source]
PR 21-OCT-1999; 99US-0160741.
PR [entries illegible in source]
PR [date illegible]; 99US-0161404.
PR [entry illegible in source]
PR [date illegible]; 99US-0161406.
PR 26-OCT-1999; 99US-0161359.
PR 26-OCT-1999; 99US-0161360.
PR 26-OCT-1999; 99US-0161361.
PR 28-OCT-1999; 99US-0161920.
PR 28-OCT-1999; 99US-0161992.
PR 28-OCT-1999; 99US-0161993.
PR 29-OCT-1999; 99US-0162142.



Query Match 7.1%; Score 30; DB 21; Length 5118; 

Best Local Similarity 55.9%; Pred. No. 22; 

Matches 57; Conservative 0; Mismatches 45; Indels 0; Gaps 0; 

Qy   43 ggaagcaatactcccctgtgctctacttctgcctgatggcccttgtcgtagctgctatgg 102
        |||||| |  || ||   ||| | || | ||  | ||| |  |  ||||    ||| ||
Db 2334 ggaagctaagcttcctgatgcccgacctttgattaatgtctgtgatcgttttggctttgt 2393

Qy  103 tctgtgtcatgtacaccacctcggcacaagcaggaaggagtg 144
         |    || |   ||  |||||  ||||| ||  | |  | |
Db 2394 acctgatcttactcattacctctacacaaacaacatgctgcg 2435



RESULT 9 
AAQ14640 

ID AAQ14640 standard; DNA; 1229 BP. 
XX 

AC AAQ14640; 
XX 

DT 30-JAN-1992 (first entry) 
XX 

DE Plasmid pGB18ARR insert encoding a human temporal lobe PDE . 
XX 

KW brain; pRATDPD; cAMP; phosphodiesterase; complementation analysis; ss.
XX 

OS Homo sapiens . 
XX 

PN WO9116457-A.
XX
PD 31-OCT-1991.
XX
PF 19-APR-1991; 91WO-US02714.
XX
PR 20-APR-1990; 90US-0511715.
XX 

PA (COLD-) COLD SPRING HARBOR. 
XX 

PI Wigler MH, Colicelli JJ; 
XX 

DR WPI; 1991-339841/46. 
XX 

PT Complementary screening for genes and prods. - e.g. RAS protein 

PT and cAMP, that modify, complement or suppress genetic defect and 

PT correct associated phenotypic alteration 
XX 

PS Example 2; Page 124; 169pp; English.
XX 

CC Plasmid pRATDPD was isolated from a rat brain cDNA library. It is
CC thought to encode a cyclic nucleotide PDE. The RATDPD cDNA was used
CC as a probe to isolate plasmid pGB18ARR by low stringency screens of
CC a human temporal lobe cDNA library. The inventors have classified
CC GB18ARR in cAMP-specific PDE class IV2 along with TM3 (AAQ14630).
XX 

SQ Sequence 1229 BP; 281 A; 360 C; 341 G; 247 T; 0 other; 



Query Match 7.0%; Score 29.8; DB 12; Length 1229;
Best Local Similarity 51.9%; Pred. No. 14;
Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0;

Qy  281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
        ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db  846 cacatccgcactcccagctcctggtggcggggggtcaggtggagaccctacctgatcccc 905

Qy  341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
         |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db  906 agacctctgtccctgttcccctccactcctcccctcactcccctgctcccccgaccacct 965

Qy  401 cgttctgtg 409
        | | || ||
Db  966 cctcctctg 974


RESULT 10 
AAT34381 

ID AAT34381 standard; cDNA; 1230 BP. 
XX 

AC AAT34381; 
XX 

DT 10-OCT-1996 (first entry) 
XX 

DE Plasmid pGB18ARR insert. 
XX 

KW Human; glioblastoma cell; plasmid; mammalian; complementation; pPDE2RR; 

KW probe; yeast; pPDET; pPDE10X inv; temporal lobe; cDNA library; pRATPDP;

KW pTM72; pGB14; pGB18ARR; pTM3; pJC44x; pGB25; phosphodiesterase family IV; 

KW pPDE18; pPDE21; ss. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 3..1157

FT /*tag= a 

FT /product= cAMP phosphodiesterase 
XX 

PN US5527896-A. 
XX 

PD 18-JUN-1996. 
XX 

PF 20-APR-1990; 90US-0511715.
XX
PR 19-APR-1991; 91US-0688352.
PR 20-APR-1990; 90US-0511715.
XX 

PA (COLD-) COLD SPRING HARBOR LAB. 
XX 

PI Colicelli JJ, Wigler MH; 
XX 

DR WPI; 1996-299902/30. 

DR P-PSDB; AAW00097. 
XX 

PT DNA mols. isolated from human glioblastoma cells - encode
PT RAS-related or cyclic nucleotide phosphodiesterase proteins
XX
PS Claim 5; Column 125-128; 101pp; English.
XX 

CC The sequences given in AAT34377-84 represent plasmid fragments which 

CC were isolated by hybridisation with mammalian genes cloned by 

CC complementation. These sequences were isolated using probes derived 

CC from the sequences given in AAT34366-76 which were cloned via 

CC complementation in yeast. Plasmids pPDET, pPDE10X inv and pPDE2RR were

CC isolated by low stringency hybridisation screens of a human temporal 

CC lobe cDNA library using the pRATPDP insert as a probe. Comparison of 

CC the nucleotide sequences given in AAT34377-79 indicated that the inserts 

CC are representatives of the same genetic locus as the insert in pTM72. 

CC Plasmids pGB14 and pGB18ARR were obtained in the same manner. DNA 

CC sequence analysis revealed that they are representatives of the same 

CC genetic locus as the inserts in pTM3 and pJC44x. Plasmid pGB25 was also 

CC obtained at low stringency hybridisation using the pRATDPD insert as a 

CC probe. Judged by its nucleotide and deduced amino acid sequence it 

CC represents a novel member of the phosphodiesterase family IV. The cDNA 

CC insert of pBG25 was used as a probe to obtain pPDE18 and pPDE21. The 

CC cDNA of pPDE18 represents the same locus as that of pGB25 and contains 

CC more sequence information than the pGB25 cDNA. The pPDE21 insert 

CC represents a fourth member of phosphodiesterase family IV. The 

CC assignment to family IV is based solely on sequence relationships. 

XX 

SQ Sequence 1230 BP; 281 A; 360 C; 341 G; 247 T; 1 other; 



Query Match 7.0%; Score 29.8; DB 17; Length 1230; 

Best Local Similarity 51.9%; Pred. No. 14; 

Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0; 

Qy  281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
        ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db  847 cacatccgcactcccagctcctggtggcggggggtcaggtggagaccctacctgatcccc 906

Qy  341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
         |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db  907 agacctctgtccctgttcccctccactcctcccctcactcccctgctcccccgaccacct 966

Qy  401 cgttctgtg 409
        | | || ||
Db  967 cctcctctg 975



RESULT 11 
AAZ32251 

ID AAZ32251 standard; cDNA; 1230 BP. 
XX 

AC AAZ32251; 
XX 

DT 19-JAN-2000 (first entry) 
XX 

DE Human temporal lobe phosphodiesterase pGB18ARR encoding cDNA. 
XX 

KW Phosphodiesterase; dunce-like phosphodiesterase; PDE; DPD; cAMP; 
KW RAS-related protein; immunoreactive; detection; genetic defect; 



KW bronchodilation; increased myocardial contractility; 

KW anti-inflammation; ss. 

XX 

OS Homo sapiens. 
XX 

PN US5977305-A. 
XX 

PD 02-NOV-1999. 
XX 

PF 07-JUN-1995; 95US-0474379.
XX
PR 01-MAR-1994; 94US-0206188.
PR 20-APR-1990; 90US-0511715.
PR 19-APR-1991; 91US-0688352.
XX 

PA (COLD-) COLD SPRING HARBOR LAB. 
XX 

PI Colicelli JJ, Wigler MH; 
XX 

DR WPI; 1999-619709/53. 

DR P-PSDB; AAY49817. 
XX 

PT New isolated RAS-related polypeptides and mammalian cyclic nucleotide 

PT phosphodiesterases, used for screening for agents which can modify 

PT complement or suppress genetic defects 
XX 

PS Example 2; Column 133-138; 145pp; English. 
XX 

CC The present invention describes new isolated RAS-related polypeptides 

CC and mammalian cyclic nucleotide phosphodiesterases (PDEs) . RAS-related 

CC polypeptides are capable of complementing a defective RAS function in 

CC yeast. The products can be used for screening for agents which can 

CC modify, complement or suppress a genetic defect in a biochemical 

CC pathway in which cAMP participates, or in a biochemical pathway which 

CC is controlled, directly or indirectly, by a RAS protein and other 

CC proteins affecting cell growth and maintenance. Developing agents that 

CC will selectively act upon PDEs is directed toward reproducing the 

CC desirable effects of cyclic nucleotides, e.g. bronchodilation, 

CC increased myocardial contractility, anti-inflammation, yet without 

CC causing the undesirable effects, e.g. increased heart rate or enhanced 

CC lipolysis. The products can also be used for therapeutic, diagnostic 

CC and prognostic uses. AAZ32229 to AAZ32285, and AAY49803 to AAY49830, 

CC represent sequences used in the exemplification of the present 

CC invention. 

XX 

SQ Sequence 1230 BP; 281 A; 360 C; 341 G; 247 T; 1 other; 



Query Match 7.0%; Score 29.8; DB 20; Length 1230; 

Best Local Similarity 51.9%; Pred. No. 14; 

Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0;

Qy  281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
        ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db  847 cacatccgcactcccagctcctggtggcggggggtcaggtggagaccctacctgatcccc 906

Qy  341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
         |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db  907 agacctctgtccctgttcccctccactcctcccctcactcccctgctcccccgaccacct 966

Qy  401 cgttctgtg 409
        | | || ||
Db  967 cctcctctg 975



RESULT 12
AAA88186
ID AAA88186 standard; cDNA; 1230 BP.
XX
AC AAA88186;
XX
DT 14-DEC-2000 (first entry)
XX
DE pGB18ARR human temporal lobe insert nucleotide sequence SEQ ID NO: 39.
XX
KW Detection; mammalian gene; yeast; microorganism; identification;
KW phenotype; characteristic; dunce-like phosphodiesterase; PDE; RAS;
KW RAS-related protein; genetic defect; hybridisation; probe; ss.
XX
OS Homo sapiens.
OS Synthetic.
XX
PN US6100025-A.
XX
PD 08-AUG-2000.
XX
PF 01-MAR-1994; 94US-0206188.
XX
PR 20-APR-1990; 90US-0511715.
PR 19-APR-1991; 91US-0688352.
XX
PA (COLD-) COLD SPRING HARBOR LAB.
XX
PI Colicelli JJ, Wigler MH;
XX
DR WPI; 2000-531664/48.
DR P-PSDB; AAB20628.
XX
PT Novel isolated DNA encoding a mammalian cyclic nucleotide
PT phosphodiesterase is present in plasmids pPDE46, pPDE43 or pPDE339 and
PT is used to modify a genetic defect in a biochemical pathway in which
PT cAMP participates
XX
PS Example 2; Column 139-144; 145pp; English.
XX
CC The present invention describes a purified and isolated DNA (I) which
CC encodes a mammalian cyclic nucleotide phosphodiesterase and is an insert
CC present in the plasmids pPDE46 (ATCC 69552), pPDE43 (ATCC 69551) or
CC pPDE339 (ATCC 69550). The DNA molecules are used to modify, complement
CC or suppress a genetic defect in a biochemical pathway in which cAMP
CC participates and are also used as hybridisation probes. The present
CC invention also describes methods for detecting mammalian genes encoding
CC proteins which can function in microorganisms, particularly yeast, to
CC modify, complement, or suppress a genetic defect associated with an
CC identifiable phenotypic alteration or characteristic in the
CC microorganism. AAA88162 to AAA88218 and AAB29614 to AAB20640 represent
CC sequences used in the exemplification of the present invention.
XX
SQ Sequence 1230 BP; 281 A; 360 C; 341 G; 247 T; 1 other;

Query Match 7.0%; Score 29.8; DB 21; Length 1230;
Best Local Similarity 51.9%; Pred. No. 14;
Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0;

Qy  281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
        ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db  847 cacatccgcactcccagctcctggtggcggggggtcaggtggagaccctacctgatcccc 906

Qy  341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
         |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db  907 agacctctgtccctgttcccctccactcctcccctcactcccctgctcccccgaccacct 966

Qy  401 cgttctgtg 409
        | | || ||
Db  967 cctcctctg 975





RESULT 13
AAZ32285
ID AAZ32285 standard; cDNA; 1481 BP.
XX
AC AAZ32285;
XX
DT 19-JAN-2000 (first entry)
XX
DE Nucleotide sequence SEQ ID NO: 87.
XX
KW Phosphodiesterase; dunce-like phosphodiesterase; PDE; DPD; cAMP;
KW RAS-related protein; immunoreactive; detection; genetic defect;
KW bronchodilation; increased myocardial contractility;
KW anti-inflammation; ss.
XX
OS Unidentified.
XX
PN US5977305-A.
XX
PD 02-NOV-1999.
XX
PF 07-JUN-1995; 95US-0474379.
XX
PR 01-MAR-1994; 94US-0206188.
PR 20-APR-1990; 90US-0511715.
PR 19-APR-1991; 91US-0688352.
XX
PA (COLD-) COLD SPRING HARBOR LAB.
XX
PI Colicelli JJ, Wigler MH;
XX
DR WPI; 1999-619709/53.
DR P-PSDB; AAY49830.
XX

PT New isolated RAS-related polypeptides and mammalian cyclic nucleotide 

PT phosphodiesterases, used for screening for agents which can modify 

PT complement or suppress genetic defects 
XX 

PS Disclosure; Column 211-214; 145pp; English. 
XX 

CC The present invention describes new isolated RAS-related polypeptides 

CC and mammalian cyclic nucleotide phosphodiesterases (PDEs) . RAS-related 

CC polypeptides are capable of complementing a defective RAS function in 

CC yeast. The products can be used for screening for agents which can 

CC modify, complement or suppress a genetic defect in a biochemical 

CC pathway in which cAMP participates, or in a biochemical pathway which 

CC is controlled, directly or indirectly, by a RAS protein and other 

CC proteins affecting cell growth and maintenance. Developing agents that 

CC will selectively act upon PDEs is directed toward reproducing the 

CC desirable effects of cyclic nucleotides, e.g. bronchodilation,

CC increased myocardial contractility, anti-inflammation, yet without 

CC causing the undesirable effects, e.g. increased heart rate or enhanced 

CC lipolysis. The products can also be used for therapeutic, diagnostic 

CC and prognostic uses. AAZ32229 to AAZ32285, and AAY49803 to AAY49830, 

CC represent sequences used in the exemplification of the present 

CC invention. 

XX 

SQ Sequence 1481 BP; 349 A; 430 C; 389 G; 313 T; 0 other; 



Query Match 7.0%; Score 29.8; DB 20; Length 1481; 

Best Local Similarity 51.9%; Pred. No. 15; 



Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0;

Qy  281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
        ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db  957 cacatccgcactcccagctcctggtggcggggggtcaggtggagaccctacctgatcccc 1016

Qy  341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
         |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db 1017 agacctctgtccctgttcccctccactcctcccctcactcccctgctcccccgaccacct 1076

Qy  401 cgttctgtg 409
        | | || ||
Db 1077 cctcctctg 1085





RESULT 14 
AAT34373 

ID AAT34373 standard; cDNA; 2702 BP. 
XX 

AC AAT34373; 
XX 

DT 09-OCT-1996 (first entry)
XX 

DE Plasmid pJC44x (ATCC 68603) insert. 
XX 

KW Human; glioblastoma; RAS-related protein; cell line U118MG; pJC265; 

KW yeast expression vector; pADNS; pADANS; fusion protein; rat pRATDPD; 

KW alcohol dehydrogenase protein; heat shock sensitivity; S. cerevisiae; 



KW TK161-R2V; pJC44x; pJC99; pJC310; cAMP phosphodiesterase; ss. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 1..2701 

FT /*tag= a 

FT /product= cAMP phosphodiesterase 
XX 

PN US5527896-A. 
XX 

PD 18-JUN-1996. 
XX 

PF 20-APR-1990; 90US-0511715.
XX
PR 19-APR-1991; 91US-0688352.
PR 20-APR-1990; 90US-0511715.
XX
PA (COLD-) COLD SPRING HARBOR LAB.
XX 

PI Colicelli JJ, Wigler MH; 
XX 

DR WPI; 1996-299902/30. 

DR P-PSDB; AAW00091. 
XX 

PT DNA mols. isolated from human glioblastoma cells - encode 

PT RAS-related or cyclic nucleotide phosphodiesterase proteins 
XX 

PS Claim 4; Column 39-44; 101pp; English.
XX 

CC This sequence represents a plasmid fragment which contains a human
CC glioblastoma cell cDNA insert encoding a cAMP phosphodiesterase. The
CC cDNA was derived from the human glioblastoma cell line U118MG and
CC transferred into two yeast expression vectors, pADNS and pADANS.
CC Plasmid pADANS differs from pADNS in that the mRNA transcribed will
CC direct the synthesis of a fusion protein including an N-terminal
CC portion derived from the alcohol dehydrogenase protein, and the
CC remainder from the mammalian cDNA insert. The two cDNA expression
CC libraries created were screened for cDNAs capable of correcting the
CC heat shock sensitivity of the S. cerevisiae host TK161-R2V. Four
CC different inserts contained in plasmids pJC44x, pJC99, pJC265 and
CC pJC310 (see also AAT34366-68) were discovered. The insert of pJC44x
CC was shown to be homologous to the rat pRATDPD gene and biochemical
CC analysis of the cellular lysates demonstrated that it encoded a cAMP
CC phosphodiesterase. The inserts of pJC99, pJC265 and pJC310 showed no
CC significant homology to previously isolated genes.
XX 

SQ Sequence 2702 BP; 574 A; 887 C; 757 G; 484 T; 0 other; 



Query Match 7.0%; Score 29.8; DB 17; Length 2702; 

Best Local Similarity 51.9%; Pred. No. 20; 

Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0;

Qy  281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
        ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db 2534 cacatccgcactcccagctcctggtggcggggggtcaggtggagaccctacctgatcccc 2593

Qy  341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
         |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db 2594 agacctctgtccctgttcccctccactcctcccctcactcccctgctcccccgaccacct 2653

Qy  401 cgttctgtg 409
        | | || ||
Db 2654 cctcctctg 2662



RESULT 15 
AAZ32236 

ID AAZ32236 standard; cDNA; 2702 BP. 
XX 

AC AAZ32236; 
XX 

DT 19-JAN-2000 (first entry) 
XX 

DE Human glioblastoma cell cAMP phosphodiesterase pJC44x encoding cDNA. 
XX 

KW Phosphodiesterase; dunce-like phosphodiesterase; PDE; DPD; cAMP; 

KW RAS-related protein; immunoreactive; detection; genetic defect; 

KW bronchodilation; increased myocardial contractility; 

KW anti-inflammation; ss. 
XX 

OS Homo sapiens. 
XX 

PN US5977305-A. 
XX 

PD 02-NOV-1999. 
XX 

PF 07-JUN-1995; 95US-0474379 . 
XX 

PR 01-MAR-1994; 94US-0206188 . 

PR 20-APR-1990; 90US-0511715.

PR 19-APR-1991; 91US-0688352 . 
XX 

PA (COLD-) COLD SPRING HARBOR LAB. 
XX 

PI Colicelli JJ, Wigler MH; 
XX 

DR WPI; 1999-619709/53. 

DR P-PSDB; AAY49804. 
XX 

PT New isolated RAS-related polypeptides and mammalian cyclic nucleotide 

PT phosphodiesterases, used for screening for agents which can modify 

PT complement or suppress genetic defects 
XX 

PS Example 1; Column 53-59; 145pp; English. 
XX 

CC The present invention describes new isolated RAS-related polypeptides 

CC and mammalian cyclic nucleotide phosphodiesterases (PDEs) . RAS-related 

CC polypeptides are capable of complementing a defective RAS function in 

CC yeast. The products can be used for screening for agents which can 

CC modify, complement or suppress a genetic defect in a biochemical 

CC pathway in which cAMP participates, or in a biochemical pathway which 

CC is controlled, directly or indirectly, by a RAS protein and other 



CC proteins affecting cell growth and maintenance. Developing agents that 

CC will selectively act upon PDEs is directed toward reproducing the 

CC desirable effects of cyclic nucleotides, e.g. bronchodilation, 

CC increased myocardial contractility, anti-inflammation, yet without 

CC causing the undesirable effects, e.g. increased heart rate or enhanced 

CC lipolysis. The products can also be used for therapeutic, diagnostic 

CC and prognostic uses. AAZ32229 to AAZ32285, and AAY49803 to AAY49830, 

CC represent sequences used in the exemplification of the present 

CC invention. 

XX 

SQ Sequence 2702 BP; 574 A; 887 C; 757 G; 484 T; 0 other; 



Query Match 7.0%; Score 29.8; DB 20; Length 2702; 

Best Local Similarity 51.9%; Pred. No. 20;
Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0;

Qy  281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
        ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db 2534 cacatccgcactcccagctcctggtggcggggggtcaggtggagaccctacctgatcccc 2593

Qy  341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
         |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db 2594 agacctctgtccctgttcccctccactcctcccctcactcccctgctcccccgaccacct 2653

Qy  401 cgttctgtg 409
        | | || ||
Db 2654 cctcctctg 2662





Search completed: February 7, 2002, 11:00:01 
Job time: 4987 sec 



GenCore version 4.5 
Copyright (c) 1993 - 2000 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on:          February 7, 2002, 10:51:46 ; Search time 172.96 Seconds
                 (without alignments)
                 556.505 Million cell updates/sec

Title:           US-09-394-745-6332
Perfect score:   425
Sequence:        1 cggacgcgtgggtgcaattt...tgtggtgcctctctcaacct 425

Scoring table:   IDENTITY_NUC
                 Gapop 10.0 , Gapext 1.0

Searched:        351203 seqs, 113238999 residues

Total number of hits satisfying chosen parameters: 702406

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       Issued_Patents_NA:*
                 1: /cgn2_6/ptodata/2/ina/5A_COMB.seq:*
                 2: /cgn2_6/ptodata/2/ina/5B_COMB.seq:*
                 3: /cgn2_6/ptodata/2/ina/6A_COMB.seq:*
                 4: /cgn2_6/ptodata/2/ina/6B_COMB.seq:*
                 5: /cgn2_6/ptodata/2/ina/PCTUS_COMB.seq:*
                 6: /cgn2_6/ptodata/2/ina/backfiles1.seq:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES

                %
Result        Query
  No.  Score  Match  Length  DB  ID                 Description
    1   29.8    7.0    1229   5  PCT-US91-02714-38  Sequence 38, Appl
    2   29.8    7.0    1230   1  US-07-688-352C-39  Sequence 39, Appl
    3   29.8    7.0    1230   2  US-08-474-379C-39  Sequence 39, Appl
    4   29.8    7.0    1230   3  US-09-146-249A-39  Sequence 39, Appl
    5   29.8    7.0    1230   3  US-08-206-188B-39  Sequence 39, Appl
    6   29.8    7.0    1481   2  US-08-474-379C-87  Sequence 87, Appl
    7   29.8    7.0    2702   1  US-07-688-352C-11  Sequence 11, Appl
    8   29.8    7.0    2702   2  US-08-474-379C-11  Sequence 11, Appl
    9   29.8    7.0    2702   3  US-09-146-249A-11  Sequence 11, Appl
   10   29.8    7.0    2702   3  US-08-206-188B-11  Sequence 11, Appl
   11   29.8    7.0    3131   1  US-07-688-352C-21  Sequence 21, Appl
   12   29.8    7.0    3131   3  US-09-146-249A-21  Sequence 21, Appl
   13   29.8    7.0    3131   3  US-08-206-188B-21  Sequence 21, Appl
   14   29.8    7.0    3131   5  PCT-US91-02714-20  Sequence 20, Appl
   15   29.8    7.0    3705   2  US-08-474-379C-64  Sequence 64, Appl
   16   29.8    7.0    3705   3  US-09-146-249A-64  Sequence 64, Appl
   17   29.8    7.0    3705   3  US-08-206-188B-64  Sequence 64, Appl
   18   29.6    7.0     740   4  US-09-020-956-17   Sequence 17, Appl
   19   29.6    7.0     740   4  US-09-030-607-17   Sequence 17, Appl
   20   28.8    6.8    4550   4  US-09-103-663-35   Sequence 35, Appl
   21   28.6    6.7     498   4  US-09-037-990B-6   Sequence 6, Appli
   22   28.6    6.7     556   4  US-09-037-990B-7   Sequence 7, Appli
   23   28.4    6.7     986   2  US-08-713-825-2    Sequence 2, Appli
   24   28.4    6.7     986   3  US-09-199-842-2    Sequence 2, Appli
   25     28    6.6    1897   1  US-08-453-477-1    Sequence 1, Appli
   26     28    6.6    1897   1  US-08-453-461-1    Sequence 1, Appli
   27     28    6.6    7218   1  US-08-232-463-14   Sequence 14, Appl
c  28   27.6    6.5     751   4  US-09-020-956-12   Sequence 12, Appl
c  29   27.6    6.5     751   4  US-09-030-607-12   Sequence 12, Appl
   30   27.6    6.5    3530   3  US-08-704-711A-10  Sequence 10, Appl
c  31   27.4    6.4     801   4  US-09-020-956-16   Sequence 16, Appl
c  32   27.4    6.4     801   4  US-09-030-607-16   Sequence 16, Appl
   33   27.4    6.4     933   3  US-08-808-148-2    Sequence 2, Appli
   34   27.4    6.4    1289   4  US-09-020-956-111  Sequence 111, App
   35   27.4    6.4    1289   4  US-09-030-607-111  Sequence 111, App
   36   27.4    6.4   16075   3  US-09-096-942-1    Sequence 1, Appli
   37   27.4    6.4   16075   3  US-09-096-867-1    Sequence 1, Appli
c  38   27.2    6.4    1557   3  US-09-329-418-2    Sequence 2, Appli
c  39   27.2    6.4    1557   4  US-09-531-914-2    Sequence 2, Appli
c  40   27.2    6.4    1873   3  US-09-329-418-1    Sequence 1, Appli
c  41   27.2    6.4    1873   4  US-09-531-914-1    Sequence 1, Appli
c  42   27.2    6.4    2870   1  US-08-468-036-28   Sequence 28, Appl
c  43   27.2    6.4    2870   2  US-08-376-843-28   Sequence 28, Appl
c  44   27.2    6.4   15328   2  US-08-888-497-33   Sequence 33, Appl
c  45   27.2    6.4   15328   5  PCT-US94-07926-33  Sequence 33, Appl
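
The % Query Match values in the summary list appear to be each hit's score expressed as a percentage of the perfect score (425 for this query). The report does not state this; it is inferred from the listed numbers. The Python sketch below checks that reading against the distinct score/percentage pairs in the list. The function name query_match_percent is hypothetical and not part of GenCore.

```python
PERFECT_SCORE = 425  # "Perfect score" from the report header


def query_match_percent(score: float, perfect: float = PERFECT_SCORE) -> float:
    """Score as a percentage of the perfect score, rounded to one decimal.

    Assumption: this is how the "% Query Match" column is derived.
    """
    return round(100.0 * score / perfect, 1)


# Distinct (score, listed % Query Match) pairs taken from the summary list.
listed = {
    29.8: 7.0,
    29.6: 7.0,
    28.8: 6.8,
    28.6: 6.7,
    28.4: 6.7,
    28.0: 6.6,
    27.6: 6.5,
    27.4: 6.4,
    27.2: 6.4,
}

for score, percent in listed.items():
    print(score, query_match_percent(score), percent)
```

Every pair in the list agrees with this reading, which is why a score of 29.8 and a score of 29.6 both show as 7.0.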



ALIGNMENTS 



RESULT 1 
PCT-US91-02714-38 

; Sequence 38, Application PC/TUS9102714
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 

APPLICANT: Colicelli, John J. 

TITLE OF INVENTION: Cloning by Complementation and Related 
TITLE OF INVENTION: Processes 
NUMBER OF SEQUENCES: 55 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & 

ADDRESSEE: Bicknell 

STREET: Two First National Plaza, 20 South Clark 

STREET: Street
CITY: Chicago
STATE: Illinois
COUNTRY: USA
ZIP: 60603
COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:
APPLICATION NUMBER: PCT/US91/02714

FILING DATE: 19910419 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 

FILING DATE: 20-APR-1990 
ATTORNEY/AGENT INFORMATION: 
; NAME: Borun, Michael F. 

REGISTRATION NUMBER: 25447 

REFERENCE/DOCKET NUMBER: 27805/30197 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (312) 346-5750 

TELEFAX: (312) 984-9740 

TELEX: 25-3856 
; INFORMATION FOR SEQ ID NO: 38: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1229 base pairs 

TYPE: NUCLEIC ACID 



STRANDEDNESS: single
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
PCT-US91-02714-38 



Query Match 7.0%; Score 29.8; DB 5; Length 1229; 

Best Local Similarity 51.9%; Pred. No. 2.6; 

Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0; 

Qy  281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
        ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db  846 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 905

Qy  341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
         |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db  906 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 965

Qy  401 cgttctgtg 409
        | | || ||
Db  966 CCTCCTCTG 974
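
For an ungapped alignment such as this one (Indels 0; Gaps 0), the Matches, Mismatches and Best Local Similarity figures can be recomputed directly from the Qy and Db rows. The Python sketch below does that for the RESULT 1 alignment. It assumes that a match is a case-insensitive identity between aligned residues and that Best Local Similarity is matches divided by aligned length; the helper name match_line is hypothetical.

```python
def match_line(query: str, subject: str) -> str:
    """Return '|' where aligned residues agree (ignoring case), else ' '."""
    if len(query) != len(subject):
        raise ValueError("ungapped alignment rows must have equal length")
    return "".join(
        "|" if q.lower() == s.lower() else " " for q, s in zip(query, subject)
    )


# Aligned rows copied from the RESULT 1 alignment (Qy 281-409, Db 846-974).
qy = (
    "cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac"
    "ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct"
    "cgttctgtg"
)
db = (
    "CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC"
    "AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT"
    "CCTCCTCTG"
)

bars = match_line(qy, db)
matches = bars.count("|")
mismatches = len(bars) - matches
similarity = 100.0 * matches / len(bars)

print(f"Matches {matches}; Mismatches {mismatches}; "
      f"Best Local Similarity {similarity:.1f}%")
```

Under these assumptions the recomputed counts agree with the figures reported for this hit (67 matches, 62 mismatches, 51.9%).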



RESULT 2 
US-07-688-352C-39 

; Sequence 39, Application US/07688352C 
; Patent No. 5527896 
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 

APPLICANT: Colicelli, John J. 

TITLE OF INVENTION: Cloning by Complementation and Related 
TITLE OF INVENTION: Processes 
NUMBER OF SEQUENCES: 57 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Marshall, O'Toole, Gerstein, Murray &
ADDRESSEE: Bicknell

STREET: Two First National Plaza, 20 South Clark 

STREET: Street 
; CITY: Chicago 

; STATE: Illinois 

COUNTRY: USA 

ZIP: 60603 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/07/688,352C

FILING DATE: 19910419 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 

FILING DATE: 20-APR-1990 
ATTORNEY/AGENT INFORMATION: 

NAME: Borun, Michael F. 

REGISTRATION NUMBER: 25447 



REFERENCE/DOCKET NUMBER: 27805/30197
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (312) 346-5750 
TELEFAX: (312) 984-9740 
TELEX: 25-3856 
; INFORMATION FOR SEQ ID NO: 39: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1230 base pairs 
TYPE: NUCLEIC ACID 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: CDS
LOCATION: 3..1156
US-07-688-352C-39 



Query Match 7.0%; Score 29.8; DB 1; Length 1230; 

Best Local Similarity 51.9%; Pred. No. 2.6; 



Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0;

Qy  281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
        ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db  847 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 906

Qy  341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
         |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db  907 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 966

Qy  401 cgttctgtg 409
        | | || ||
Db  967 CCTCCTCTG 975





RESULT 3 
US-08-474-379C-39 

; Sequence 39, Application US/08474379C 
; Patent No. 5977305 
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 

APPLICANT: Colicelli, John J. 

TITLE OF INVENTION: CLONING BY COMPLEMENTATION AND RELATED 
TITLE OF INVENTION: PROCESSES 
NUMBER OF SEQUENCES: 88 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun 

STREET: 233 South Wacker Drive/6300 Sears Tower 

CITY: Chicago 

STATE: Illinois 

COUNTRY: United States of America 
ZIP: 60606-6402 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:
APPLICATION NUMBER: US/08/474,379C

FILING DATE: 07-JUN-1995 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 

FILING DATE: 20-APR-1990 
PRIOR APPLICATION DATA:

APPLICATION NUMBER: US 08/206,188 

FILING DATE: 01-MAR-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/688,352 

FILING DATE: 19-APR-1991 
ATTORNEY/AGENT INFORMATION: 

NAME: Clough, David W. 

REGISTRATION NUMBER: 36,107 

REFERENCE/DOCKET NUMBER: 27866/32771 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (312) 474-6300 

TELEFAX: (312) 474-0448 
INFORMATION FOR SEQ ID NO: 39: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1230 base pairs 
; TYPE: nucleic acid 

STRANDEDNESS: single
TOPOLOGY: linear
MOLECULE TYPE: cDNA
FEATURE:
NAME/KEY: CDS
LOCATION: 3..1154
US-08-474-379C-39 



Query Match 7.0%; Score 29.8; DB 2; Length 1230; 

Best Local Similarity 51.9%; Pred. No. 2.6; 

Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0; 

Qy  281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
        ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db  847 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 906

Qy  341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
         |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db  907 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 966

Qy  401 cgttctgtg 409
        | | || ||
Db  967 CCTCCTCTG 975



RESULT 4 
US-09-146-249A-39 

; Sequence 39, Application US/09146249A 
; Patent No. 6069240 
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 

APPLICANT: Colicelli, John J. 



; TITLE OF INVENTION: Cloning by Complementation and Related 
TITLE OF INVENTION: Processes 
NUMBER OF SEQUENCES: 85 
CORRESPONDENCE ADDRESS:
; ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun 

STREET: 6300 Sears Tower, 233 South Wacker Drive 
CITY: Chicago 
STATE: Illinois 

COUNTRY: United States of America 

ZIP: 60606-6402 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/146,249A

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 

FILING DATE: 20-APR-1990 
ATTORNEY/AGENT INFORMATION: 

NAME: Clough, David W. 

REGISTRATION NUMBER: 36,107 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 312/474-6300 

TELEFAX: 312-474-0448 

TELEX: 25-3856 
INFORMATION FOR SEQ ID NO: 39: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1230 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: CDS 

LOCATION: 3..1156
US-09-146-249A-39 



Query Match 7.0%; Score 29.8; DB 3; Length 1230; 

Best Local Similarity 51.9%; Pred. No. 2.6; 



Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0;

Qy    281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
          ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db    847 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 906

Qy    341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
           |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db    907 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 966

Qy    401 cgttctgtg 409
          | | || ||
Db    967 CCTCCTCTG 975





RESULT 5 
US-08-206-188B-39 

; Sequence 39, Application US/08206188B 
; Patent No. 6100025 
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 
APPLICANT: Colicelli, John J. 
; TITLE OF INVENTION: Cloning by Complementation and Related 
TITLE OF INVENTION: Processes 
NUMBER OF SEQUENCES: 84 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun

STREET: 6300 Sears Tower, 233 South Wacker Drive 

CITY: Chicago 

STATE: Illinois 
; COUNTRY: United States of America 

ZIP: 60606-6402
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/206,188B

FILING DATE: 01-MAR-1994 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 

FILING DATE: 20-APR-1990 
ATTORNEY/AGENT INFORMATION: 

NAME: Clough, David W. 

REGISTRATION NUMBER: 36107 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 312/474-6300 

TELEFAX: 312-474-0448 

TELEX: 25-3856 
; INFORMATION FOR SEQ ID NO: 39: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1230 base pairs 
; TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE:

NAME/KEY: CDS

LOCATION: 3..1156
US-08-206-188B-39 



Query Match 7.0%; Score 29.8; DB 3; Length 1230; 

Best Local Similarity 51.9%; Pred. No. 2.6; 

Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0; 

Qy    281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
          ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db    847 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 906

Qy    341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
           |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db    907 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 966

Qy    401 cgttctgtg 409
          | | || ||
Db    967 CCTCCTCTG 975



RESULT 6 
US-08-474-379C-87 

; Sequence 87, Application US/08474379C 
; Patent No. 5977305 
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 

APPLICANT: Colicelli, John J. 

TITLE OF INVENTION: CLONING BY COMPLEMENTATION AND RELATED 
TITLE OF INVENTION: PROCESSES 
NUMBER OF SEQUENCES: 88
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun 

STREET: 233 South Wacker Drive/6300 Sears Tower 

CITY: Chicago 

STATE: Illinois 

COUNTRY: United States of America 

ZIP: 60606-6402 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/474,379C

FILING DATE: 07-JUN-1995 

CLASSIFICATION: 435
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 
FILING DATE: 20-APR-1990
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/206,188 

FILING DATE: 01-MAR-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/688,352 

FILING DATE: 19-APR-1991 
ATTORNEY/AGENT INFORMATION: 

NAME: Clough, David W. 

REGISTRATION NUMBER: 36,107 

REFERENCE/DOCKET NUMBER: 27866/32771 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (312) 474-6300 

TELEFAX: (312) 474-0448 
; INFORMATION FOR SEQ ID NO: 87: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1481 base pairs 
; TYPE: nucleic acid 



STRANDEDNESS: single
TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

FEATURE: 

NAME/KEY: CDS 
LOCATION: 1..1008 
US-08-474-379C-87 



Query Match 7.0%; Score 29.8; DB 2; Length 1481;

Best Local Similarity 51.9%; Pred. No. 2.8; 



Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0;

Qy    281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
          ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db    957 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 1016

Qy    341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
           |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db   1017 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 1076

Qy    401 cgttctgtg 409
          | | || ||
Db   1077 CCTCCTCTG 1085





RESULT 7 
US-07-688-352C-11 

; Sequence 11, Application US/07688352C 
; Patent No. 5527896 
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 

APPLICANT: Colicelli, John J. 

TITLE OF INVENTION: Cloning by Complementation and Related 
TITLE OF INVENTION: Processes 
NUMBER OF SEQUENCES: 57 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & 

ADDRESSEE: Bicknell 

STREET: Two First National Plaza, 20 South Clark 

STREET: Street 
; CITY: Chicago 

STATE: Illinois 

COUNTRY: USA 

ZIP: 60603 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible

OPERATING SYSTEM: PC-DOS/MS-DOS

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/07/688,352C

FILING DATE: 19910419 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 

FILING DATE: 20-APR-1990 



ATTORNEY/AGENT INFORMATION:
; NAME: Borun, Michael F. 

REGISTRATION NUMBER: 25447 

REFERENCE/DOCKET NUMBER: 27805/30197 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (312) 346-5750 

TELEFAX: (312) 984-9740 

TELEX: 25-3856 
INFORMATION FOR SEQ ID NO: 11: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 2702 base pairs 

TYPE: NUCLEIC ACID 

STRANDEDNESS: single

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: CDS

LOCATION: 2..2701
US-07-688-352C-11 



Query Match 7.0%; Score 29.8; DB 1; Length 2702; 

Best Local Similarity 51.9%; Pred. No. 3.7; 

Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0;

Qy    281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
          ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db   2534 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 2593

Qy    341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
           |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db   2594 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 2653

Qy    401 cgttctgtg 409
          | | || ||
Db   2654 CCTCCTCTG 2662



RESULT 8 
US-08-474-379C-11 

; Sequence 11, Application US/08474379C 
; Patent No. 5977305 
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 

APPLICANT: Colicelli, John J. 

TITLE OF INVENTION: CLONING BY COMPLEMENTATION AND RELATED 
TITLE OF INVENTION: PROCESSES 
NUMBER OF SEQUENCES: 88 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun 
; STREET: 233 South Wacker Drive/6300 Sears Tower 

CITY: Chicago 
; STATE: Illinois 

COUNTRY: United States of America 

ZIP: 60606-6402 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 



COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/474,379C
FILING DATE: 07-JUN-1995 
CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 
FILING DATE: 20-APR-1990 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/206,188 
FILING DATE: 01-MAR-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/688,352 
FILING DATE: 19-APR-1991 
ATTORNEY/AGENT INFORMATION:
NAME: Clough, David W. 
REGISTRATION NUMBER: 36,107 
REFERENCE/DOCKET NUMBER: 27866/32771 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (312) 474-6300 
TELEFAX: (312) 474-0448 
INFORMATION FOR SEQ ID NO: 11: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 2702 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: CDS 
LOCATION: 8..2701
FEATURE:

NAME/KEY: misc_feature
LOCATION: 2433 

OTHER INFORMATION: /note= "A shift in reading frame 
OTHER INFORMATION: may occur at this nucleotide." 
US-08-474-379C-11 



Query Match 7.0%; Score 29.8; DB 2; Length 2702; 

Best Local Similarity 51.9%; Pred. No. 3.7; 

Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0; 

Qy    281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
          ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db   2534 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 2593

Qy    341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
           |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db   2594 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 2653

Qy    401 cgttctgtg 409
          | | || ||
Db   2654 CCTCCTCTG 2662



RESULT 9 
US-09-146-249A-11 

Sequence 11, Application US/09146249A 
Patent No. 6069240 
GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 
APPLICANT: Colicelli, John J. 

TITLE OF INVENTION: Cloning by Complementation and Related 
TITLE OF INVENTION: Processes 
NUMBER OF SEQUENCES: 85 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun 
STREET: 6300 Sears Tower, 233 South Wacker Drive 
CITY: Chicago 
STATE: Illinois 

COUNTRY: United States of America 
ZIP: 60606-6402 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/146,249A
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 
FILING DATE: 20-APR-1990 
ATTORNEY/AGENT INFORMATION: 
NAME: Clough, David W.
REGISTRATION NUMBER: 36,107
TELECOMMUNICATION INFORMATION:
TELEPHONE: 312/474-6300
TELEFAX: 312-474-0448
TELEX: 25-3856 
INFORMATION FOR SEQ ID NO: 11: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 2702 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: CDS 
LOCATION: 8..2701
FEATURE:

NAME/KEY: misc_feature
LOCATION: 2433 

OTHER INFORMATION: /note= "A shift in reading frame 
OTHER INFORMATION: may occur at this nucleotide." 
US-09-146-249A-11 



Query Match 7.0%; Score 29.8; DB 3; Length 2702; 

Best Local Similarity 51.9%; Pred. No. 3.7; 



Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0;

Qy    281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
          ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db   2534 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 2593

Qy    341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
           |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db   2594 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 2653

Qy    401 cgttctgtg 409
          | | || ||
Db   2654 CCTCCTCTG 2662





RESULT 10 
US-08-206-188B-11 

; Sequence 11, Application US/08206188B 
; Patent No. 6100025 
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 

APPLICANT: Colicelli, John J. 

TITLE OF INVENTION: Cloning by Complementation and Related 
TITLE OF INVENTION: Processes 
NUMBER OF SEQUENCES: 84 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun 

STREET: 6300 Sears Tower, 233 South Wacker Drive 

CITY: Chicago 
; STATE: Illinois 

COUNTRY: United States of America 

ZIP: 60606-6402 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/206,188B

FILING DATE: 01-MAR-1994 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 

FILING DATE: 20-APR-1990 
ATTORNEY/AGENT INFORMATION: 

NAME: Clough, David W. 

REGISTRATION NUMBER: 36107 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 312/474-6300 

TELEFAX: 312-474-0448 

TELEX: 25-3856 
; INFORMATION FOR SEQ ID NO: 11: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 2702 base pairs 
; TYPE: nucleic acid 

STRANDEDNESS: single

TOPOLOGY: linear 



MOLECULE TYPE: cDNA 
FEATURE:

NAME/KEY: CDS
LOCATION: 8..2701
FEATURE:

NAME/KEY: misc_feature
LOCATION: 2433 

OTHER INFORMATION: /note= "A shift in reading frame 
OTHER INFORMATION: may occur at this nucleotide." 
US-08-206-188B-11 



Query Match 7.0%; Score 29.8; DB 3; Length 2702; 

Best Local Similarity 51.9%; Pred. No. 3.7; 

Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0; 

Qy    281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
          ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db   2534 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 2593

Qy    341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
           |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db   2594 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 2653

Qy    401 cgttctgtg 409
          | | || ||
Db   2654 CCTCCTCTG 2662



RESULT 11 
US-07-688-352C-21 

; Sequence 21, Application US/07688352C 
; Patent No. 5527896 
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 

APPLICANT: Colicelli, John J. 

TITLE OF INVENTION: Cloning by Complementation and Related 
TITLE OF INVENTION: Processes 
NUMBER OF SEQUENCES: 57 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & 

ADDRESSEE: Bicknell 

STREET: Two First National Plaza, 20 South Clark 

STREET: Street 

CITY: Chicago 

STATE: Illinois 

COUNTRY: USA 

ZIP: 60603 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/07/688,352C

FILING DATE: 19910419 

CLASSIFICATION: 435 



PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 

FILING DATE: 20-APR-1990 
ATTORNEY/AGENT INFORMATION:
; NAME: Borun, Michael F. 

REGISTRATION NUMBER: 25447 

REFERENCE/DOCKET NUMBER: 27805/30197 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (312) 346-5750 

TELEFAX: (312) 984-9740 

TELEX: 25-3856 
; INFORMATION FOR SEQ ID NO: 21: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 3131 base pairs 

TYPE: NUCLEIC ACID 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: misc_feature 

LOCATION: 1652 
; OTHER INFORMATION: /note= "A shift in reading frame

; OTHER INFORMATION: may occur at this residue." 

FEATURE: 

NAME/KEY: CDS

LOCATION: join(743..1648,1651..2661)
US-07-688-352C-21 



Query Match 7.0%; Score 29.8; DB 1; Length 3131; 

Best Local Similarity 51.9%; Pred. No. 4; 



Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0;

Qy    281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
          ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db   2607 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 2666

Qy    341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
           |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db   2667 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 2726

Qy    401 cgttctgtg 409
          | | || ||
Db   2727 CCTCCTCTG 2735





RESULT 12 
US-09-146-249A-21 

; Sequence 21, Application US/09146249A 
; Patent No. 6069240 
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 

APPLICANT: Colicelli, John J. 

TITLE OF INVENTION: Cloning by Complementation and Related 
TITLE OF INVENTION: Processes 
NUMBER OF SEQUENCES: 85 
CORRESPONDENCE ADDRESS: 



ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun
STREET: 6300 Sears Tower, 233 South Wacker Drive
CITY: Chicago
STATE: Illinois

COUNTRY: United States of America 
ZIP: 60606-6402 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/146,249A
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 
FILING DATE: 20-APR-1990 
ATTORNEY/AGENT INFORMATION: 
NAME: Clough, David W. 
REGISTRATION NUMBER: 36,107 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 312/474-6300
TELEFAX: 312-474-0448
TELEX: 25-3856 
INFORMATION FOR SEQ ID NO: 21: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 3131 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: misc_feature 

OTHER INFORMATION: /note= "Nucleotides 429-427 and 634-670 
OTHER INFORMATION: may represent introns; sequence may have frame shifts 
at nucleo 

OTHER INFORMATION: 592, 1590 and 1592." 
FEATURE: 

NAME/KEY: CDS 

LOCATION: join(2..1648,1651..2661)
US-09-146-249A-21 



Query Match 7.0%; Score 29.8; DB 3; Length 3131;

Best Local Similarity 51.9%; Pred. No. 4; 

Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0; 

Qy    281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
          ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db   2607 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 2666

Qy    341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
           |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db   2667 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 2726

Qy    401 cgttctgtg 409
          | | || ||
Db   2727 CCTCCTCTG 2735



RESULT 13 
US-08-206-188B-21 

; Sequence 21, Application US/08206188B 
; Patent No. 6100025 
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 

APPLICANT: Colicelli, John J. 

TITLE OF INVENTION: Cloning by Complementation and Related 
TITLE OF INVENTION: Processes 
NUMBER OF SEQUENCES: 84 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun 

STREET: 6300 Sears Tower, 233 South Wacker Drive 
; CITY: Chicago 

STATE: Illinois 

COUNTRY: United States of America 

ZIP: 60606-6402 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/206,188B

FILING DATE: 01-MAR-1994 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 

FILING DATE: 20-APR-1990 
ATTORNEY/AGENT INFORMATION: 

NAME: Clough, David W. 

REGISTRATION NUMBER: 36107 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 312/474-6300 

TELEFAX: 312-474-0448 

TELEX: 25-3856 
; INFORMATION FOR SEQ ID NO: 21: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 3131 base pairs 
; TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: misc_feature 

OTHER INFORMATION: /note= "Nucleotides 429-427 and 634-670 may 
OTHER INFORMATION: represent introns; sequence may have frame shifts at 
OTHER INFORMATION: nucleotides 328, 592, 1590 and 1592." 
FEATURE: 

NAME/KEY: CDS

LOCATION: join(2..1648,1651..2661)
US-08-206-188B-21 



Query Match 7.0%; Score 29.8; DB 3; Length 3131; 

Best Local Similarity 51.9%; Pred. No. 4; 

Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0; 

Qy    281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
          ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db   2607 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 2666

Qy    341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
           |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db   2667 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 2726

Qy    401 cgttctgtg 409
          | | || ||
Db   2727 CCTCCTCTG 2735



RESULT 14 
PCT-US91-02714-20 

; Sequence 20, Application PC/TUS9102714
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 

APPLICANT: Colicelli, John J. 

TITLE OF INVENTION: Cloning by Complementation and Related 
TITLE OF INVENTION: Processes 
NUMBER OF SEQUENCES: 55 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & 

ADDRESSEE: Bicknell 

STREET: Two First National Plaza, 20 South Clark 

STREET: Street 

CITY: Chicago 

STATE: Illinois 

COUNTRY: USA 

ZIP: 60603 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: PCT/US91/02714 

FILING DATE: 19910419 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 

FILING DATE: 20-APR-1990 
ATTORNEY/AGENT INFORMATION: 
; NAME: Borun, Michael F. 

REGISTRATION NUMBER: 25447 

REFERENCE/DOCKET NUMBER: 27805/30197 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (312) 346-5750 

TELEFAX: (312) 984-9740 

TELEX: 25-3856 
INFORMATION FOR SEQ ID NO: 20: 



SEQUENCE CHARACTERISTICS: 

LENGTH: 3131 base pairs 

TYPE: NUCLEIC ACID 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: CDS 

LOCATION: 743..1651
FEATURE:

NAME/KEY: misc_feature

LOCATION: 1652 

OTHER INFORMATION: /note= "A shift in reading frame 
OTHER INFORMATION: may occur at this residue." 
PCT-US91-02714-20 



Query Match 7.0%; Score 29.8; DB 5; Length 3131; 

Best Local Similarity 51.9%; Pred. No. 4; 



Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0;

Qy    281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
          ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db   2607 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 2666

Qy    341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
           |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db   2667 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 2726

Qy    401 cgttctgtg 409
          | | || ||
Db   2727 CCTCCTCTG 2735





RESULT 15 
US-08-474-379C-64 

; Sequence 64, Application US/08474379C 
; Patent No. 5977305 
; GENERAL INFORMATION: 

APPLICANT: Wigler, Michael H. 

APPLICANT: Colicelli, John J. 

TITLE OF INVENTION: CLONING BY COMPLEMENTATION AND RELATED 
TITLE OF INVENTION: PROCESSES 
NUMBER OF SEQUENCES: 88 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun 

STREET: 233 South Wacker Drive/6300 Sears Tower 

CITY: Chicago 

STATE: Illinois 

COUNTRY: United States of America 

ZIP: 60606-6402 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/474,379C

FILING DATE: 07-JUN-1995

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/511,715 

FILING DATE: 20-APR-1990 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/206,188 

FILING DATE: 01-MAR-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/688,352 

FILING DATE: 19-APR-1991 
ATTORNEY/AGENT INFORMATION:

NAME: Clough, David W. 

REGISTRATION NUMBER: 36,107 

REFERENCE/DOCKET NUMBER: 27866/32771 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (312) 474-6300 

TELEFAX: (312) 474-0448 
; INFORMATION FOR SEQ ID NO: 64: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 3705 base pairs 
; TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 

NAME/KEY: CDS 

LOCATION: 116..2773
US-08-474-379C-64 



Query Match 7.0%; Score 29.8; DB 2; Length 3705; 

Best Local Similarity 51.9%; Pred. No. 4.3;

Matches 67; Conservative 0; Mismatches 62; Indels 0; Gaps 0; 

Qy    281 cacactgccgctcccccacgctaaatttgggggctacagtgcacacgctagccgattaac 340
          ||||    | |||||     ||      ||||| |   ||| | || ||| | |||   |
Db   2722 CACATCCGCACTCCCAGCTCCTGGTGGCGGGGGGTCAGGTGGAGACCCTACCTGATCCCC 2781

Qy    341 ggctcacgctaccaggcgctctacgcggatgtgccccctagccagcttctctctccccct 400
           |  | |  | || |   | || | |   |   | | ||  || ||| | |   || |||
Db   2782 AGACCTCTGTCCCTGTTCCCCTCCACTCCTCCCCTCACTCCCCTGCTCCCCCGACCACCT 2841

Qy    401 cgttctgtg 409
          | | || ||
Db   2842 CCTCCTCTG 2850



Search completed: February 7, 2002, 10:51:52 
Job time: 6078 sec 



GenCore version 4.5 
Copyright (c) 1993 - 2000 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 



Run on:         February 7, 2002, 08:20:45 ; Search time 4942.22 Seconds
                                             (without alignments)
                                             924.070 Million cell updates/sec

Title:          US-09-394-745-6332
Perfect score:  425
Sequence:       1 cggacgcgtgggtgcaattt tgtggtgcctctctcaacct 425

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       11351937 seqs, 5372889281 residues

Total number of hits satisfying chosen parameters:      22703874

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries


Database :      EST:*
                1:  em_estfun:*
                2:  em_esthum:*
                3:  em_estin:*
                4:  em_estom:*
                5:  em_estpl:*
                6:  em_estba:*
                7:  em_estro:*
                8:  em_estov:*
                9:  em_htc:*
                10:  gb_est1:*
                11:  gb_est2:*
                12:  gb_htc:*
                13:  gb_gss:*
                14:  em_gss_fun:*
                15:  em_gss_hum:*
                16:  em_gss_inv:*
                17:  em_gss_pln:*
                18:  em_gss_pro:*
                19:  em_gss_rod:*
                20:  em_gss_vrt:*
                21:  em_gss_other:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
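
The throughput figure in the run header above can be cross-checked from the other header values. The sketch below rests on an assumption about how the figure is derived: query length times database residues, doubled on the guess that both strands are searched (the hit count 22703874 is exactly twice the 11351937 sequences). The function name is hypothetical and nothing here comes from GenCore documentation.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Estimate dynamic-programming cell updates per second, in millions.

    Assumption: one cell per (query residue, database residue) pair,
    counted once per strand searched.
    """
    cells = strands * query_len * db_residues
    return cells / seconds / 1e6


# Header values: 425-residue query, 5372889281 residues, 4942.22 seconds.
rate = million_cell_updates_per_sec(425, 5372889281, 4942.22)
print(f"{rate:.3f} Million cell updates/sec")
```

With the header values this comes to about 924.07, in line with the printed 924.070 Million cell updates/sec, which is some evidence that the two-strand assumption holds for this run.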

SUMMARIES 



Result          Query
   No.   Score  Match  Length  DB  ID         Description

     1   409.8   96.4     738  11  BG837751   BG837751 Zm10_05b0
ALIGNMENTS



RESULT 1 
BG837751/C 
LOCUS       BG837751     738 bp    mRNA            EST       25-MAY-2001
DEFINITION  Zm10_05b06_A Zm10_AAFC_ECORC_Fusarium_graminearum_corn_silk Zea
            mays cDNA clone Zm10_05b06, mRNA sequence.
ACCESSION   BG837751
VERSION     BG837751.1  GI:14204074
KEYWORDS    EST.
SOURCE      Zea mays.
  ORGANISM  Zea mays
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Zea.
REFERENCE   1  (bases 1 to 738)
  AUTHORS   Harris,L.J., Balcerzak,M., Allard,S., Saparno,A., Couroux,P., De
            Moors,A., Hattori,J.I., Ouellet,T., Robert,L.S., Singh,J.A,
            Sprott,D. and Tinker,N.A.
  TITLE     Expressed Sequence Tags from Maize Silk Six Hours After Silk
            Channel Inoculation with Fusarium graminearum
  JOURNAL   Unpublished (2001)
COMMENT     Contact: Harris, Linda J.
            Eastern Cereal and Oilseed Research Centre
            Agriculture and Agri-food Canada
            Bldg. 21, Central Experimental Farm, Ottawa, Ontario, K1A 0C6,
            CANADA
            Tel: (613) 759-1314
            Fax: (613) 759-6566
            Email: harrislj@em.agr.ca.
FEATURES             Location/Qualifiers
     source          1..738
                     /organism="Zea mays"
                     /cultivar="C0388"
                     /db_xref="taxon:4577"
                     /clone="Zm10_05b06"
                     /clone_lib="Zm10_AAFC_ECORC_Fusarium_graminearum_corn_silk"
                     /tissue_type="Silk"
                     /dev_stage="4-5 days post-silk emergence"
                     /note="Vector: Bluescript SK+/XhoI-EcoRI; Site_1: EcoRI;
                     Site_2: XhoI; Field-grown corn was silk channel-inoculated
                     in the morning (-10 am) with 1 ml of a macroconidial
                     suspension (500,000 spores/ml) of Fusarium graminearum and
                     silk channels were collected and immediately frozen in
                     liquid nitrogen 6 hours later. RNA was extracted from
                     silk tissue between 1 cm below and above the inoculation
                     point in the silk channel, RNA from five silk channels was
                     pooled."
BASE COUNT      187 a    207 c    161 g    183 t
ORIGIN



Query Match 96.4%; Score 409.8; DB 11; Length 738; 

Best Local Similarity 99.5%; Pred. No. 1.2e-112; 



Matches 411; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy     13 tgcaatttgaggagagagacgagatcatgaggaagcaatactcccctgtgctctacttct 72
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    725 TGCAATTTGAGGAGAGAGACGAGATCATGAGGAAGCAATACTCCCCTGTGCTCTACTTCT 666

Qy     73 gcctgatggcccttgtcgtagctgctatggtctgtgtcatgtacaccacctcggcacaag 132
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    665 GCCTGATGGCCCTTGTCGTAGCTGCTATGGTCTGTGTCATGTACACCACCTCGGCACAAG 606

Qy    133 caggaaggagtggctacaactcgtacgaacctgatggaaggggtggatacaactctgttc 192
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    605 CAGGAAGGAGTGGCTACAACTCGTACGAACCTGATGGAAGGGGTGGATACAACTCTGTTC 546

Qy    193 ccatcaacggcggtggcagcccctagctaggcggtggatccgagcctgtatcagaaatcg 252
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    545 CCATCAACGGCGGTGGCAGCCCCTAGCTAGGCGGTGGATCCGAGCCTGTATCAGAAATCG 486

Qy    253 aaataatataagactgtcttcaacggatcacactgccgctcccccacgctaaatttgggg 312
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    485 AAATAATATAAGACTGTCTTCAACGGATCACACTGCCGCTCCCCCACGCTAAATTTGGGG 426

Qy    313 gctacagtgcacacgctagccgattaacggctcacgctaccaggcgctctacgcggatgt 372
          |||||||||||||||||||||||| |||||||||||||||||||||||||||||||||||
Db    425 GCTACAGTGCACACGCTAGCCGATCAACGGCTCACGCTACCAGGCGCTCTACGCGGATGT 366

Qy    373 gccccctagccagcttctctctccccctcgttctgtggtgcctctctcaacct 425
          ||||||||||||||||||||||||||||||||||||||||||||||||| |||
Db    365 GCCCCCTAGCCAGCTTCTCTCTCCCCCTCGTTCTGTGGTGCCTCTCTCACCCT 313





RESULT 2 
BG349846/C 
LOCUS       BG349846     330 bp    mRNA            EST       01-MAR-2001
DEFINITION  947031G10.x3 947 - 2 week shoot from Barkan lab Zea mays cDNA, mRNA
            sequence.
ACCESSION   BG349846
VERSION     BG349846.1  GI:13178588
KEYWORDS    EST.
SOURCE      Zea mays.
  ORGANISM  Zea mays
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Zea.
REFERENCE   1  (bases 1 to 330)
  AUTHORS   Walbot,V.
  TITLE     Maize ESTs from various cDNA libraries sequenced at Stanford
            University
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Walbot V
            Department of Biological Sciences
            Stanford University
            855 California Ave, Palo Alto, CA 94304, USA
            Tel: 650 723 2227
            Fax: 650 725 8221
            Email: walbot@stanford.edu
            Plate: 947031 row: G column: 10.
FEATURES             Location/Qualifiers
     source          1..330
                     /organism="Zea mays"
                     /cultivar="B73"
                     /db_xref="taxon:4577"
                     /clone_lib="947 - 2 week shoot from Barkan lab"
                     /tissue_type="leaf and stem, including leaf base"
                     /dev_stage="2 week old seedling (3 leaves)"
                     /lab_host="XL1-Blue"
                     /note="Organ: shoot; Vector: Lambda ZAP (pBlueScript SK-);
                     Site_1: EcoRI; Site_2: XhoI; Directionally cloned using
                     Stratagene's UniZap XR cDNA cloning kit with the 5' end
                     at the EcoRI site. The library represents 8 x 10e5
                     independent recombinant phage. The plants were greenhouse
                     grown."
BASE COUNT       83 a     93 c     69 g     85 t
ORIGIN



Query Match 48.4%; Score 205.6; DB 11; Length 330;
Best Local Similarity 94.2%; Pred. No. 2.6e-51;
Matches 226; Conservative 0; Mismatches 9; Indels 5; Gaps 1;

Qy     24 gagagagacgagatcatgaggaagcaatactcccctgtgctctacttctgcctgatggcc 83
          | ||||||||||||||||||||||||||||||||||||||||| ||| ||||||||||||
Db    324 GGGAGAGACGAGATCATGAGGAAGCAATACTCCCCTGTGCTCTCCTTGTGCCTGATGGCC 265

Qy     84 cttgtcgtagctgctatggtctgtgtcatgtacaccacctcggcacaagcaggaaggagt 143
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    264 CTTGTCGTAGCTGCTATGGTCTGTGTCATGTACACCACCTCGGCACAAGCAGGAAGGAGT 205

Qy    144 ggctacaactcgtacgaacctgatggaaggggtggatacaactctgttcccatcaacggc 203
          |||||||||||||||||||||||||||||| ||||||||||||||||||| || ||||||
Db    204 GGCTACAACTCGTACGAACCTGATGGAAGGAGTGGATACAACTCTGTTCCAATAAACGGC 145

Qy    204 ggtggcagcccctagctaggcggtggatccgagcctgtatcagaaatcgaaataatataa 263
          |||||||||||||||| ||||||||     |||||||||||||||||| ||| |||||||
Db    144 GGTGGCAGCCCCTAGCCAGGCGGTG-----GAGCCTGTATCAGAAATCAAAAAAATATAA 90



RESULT 3
BG349673/c
LOCUS BG349673 332 bp mRNA EST 01-MAR-2001
DEFINITION 947031G10.x1 947 - 2 week shoot from Barkan lab Zea mays cDNA, mRNA
sequence.
ACCESSION BG349673
VERSION BG349673.1 GI:13178400
KEYWORDS EST.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 332)
AUTHORS Walbot,V.
TITLE Maize ESTs from various cDNA libraries sequenced at Stanford
University
JOURNAL Unpublished (1999)
COMMENT Contact: Walbot V
Department of Biological Sciences
Stanford University
855 California Ave, Palo Alto, CA 94304, USA
Tel: 650 723 2227
Fax: 650 725 8221
Email: walbot@stanford.edu
Plate: 947031 row: G column: 10.
FEATURES Location/Qualifiers
source 1..332
/organism="Zea mays"
/cultivar="B73"
/db_xref="taxon:4577"
/clone_lib="947 - 2 week shoot from Barkan lab"
/tissue_type="leaf and stem, including leaf base"
/dev_stage="2 week old seedling (3 leaves)"
/lab_host="XL1-Blue"
/note="Organ: shoot; Vector: Lambda ZAP (pBlueScript SK-);
Site_1: EcoRI; Site_2: XhoI; Directionally cloned using
Stratagene's UniZap XR cDNA cloning kit with the 5' end
at the EcoRI site. The library represents 8 x 10e5
independent recombinant phage. The plants were greenhouse
grown."
BASE COUNT 84 a 94 c 69 g 85 t
ORIGIN



Query Match 48.4%; Score 205.6; DB 11; Length 332; 

Best Local Similarity 94.2%; Pred. No. 2.6e-51; 

Matches 226; Conservative 0; Mismatches 9; Indels 5; Gaps 1; 

Qy     24 gagagagacgagatcatgaggaagcaatactcccctgtgctctacttctgcctgatggcc 83
          | ||||||||||||||||||||||||||||||||||||||||| ||| ||||||||||||
Db    326 GGGAGAGACGAGATCATGAGGAAGCAATACTCCCCTGTGCTCTCCTTGTGCCTGATGGCC 267

Qy     84 cttgtcgtagctgctatggtctgtgtcatgtacaccacctcggcacaagcaggaaggagt 143
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    266 CTTGTCGTAGCTGCTATGGTCTGTGTCATGTACACCACCTCGGCACAAGCAGGAAGGAGT 207

Qy    144 ggctacaactcgtacgaacctgatggaaggggtggatacaactctgttcccatcaacggc 203
          |||||||||||||||||||||||||||||| ||||||||||||||||||| || ||||||
Db    206 GGCTACAACTCGTACGAACCTGATGGAAGGAGTGGATACAACTCTGTTCCAATAAACGGC 147

Qy    204 ggtggcagcccctagctaggcggtggatccgagcctgtatcagaaatcgaaataatataa 263
          |||||||||||||||| ||||||||     |||||||||||||||||| ||| |||||||
Db    146 GGTGGCAGCCCCTAGCCAGGCGGTG-----GAGCCTGTATCAGAAATCAAAAAAATATAA 92



RESULT 4
BG349674/c
LOCUS BG349674 337 bp mRNA EST 01-MAR-2001
DEFINITION 947031G10.x2 947 - 2 week shoot from Barkan lab Zea mays cDNA, mRNA
sequence.
ACCESSION BG349674
VERSION BG349674.1 GI:13178401
KEYWORDS EST.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 337)
AUTHORS Walbot,V.
TITLE Maize ESTs from various cDNA libraries sequenced at Stanford
University
JOURNAL Unpublished (1999)
COMMENT Contact: Walbot V
Department of Biological Sciences
Stanford University
855 California Ave, Palo Alto, CA 94304, USA
Tel: 650 723 2227
Fax: 650 725 8221
Email: walbot@stanford.edu
Plate: 947031 row: G column: 10.
FEATURES Location/Qualifiers
source 1..337
/organism="Zea mays"
/cultivar="B73"
/db_xref="taxon:4577"
/clone_lib="947 - 2 week shoot from Barkan lab"
/tissue_type="leaf and stem, including leaf base"
/dev_stage="2 week old seedling (3 leaves)"
/lab_host="XL1-Blue"
/note="Organ: shoot; Vector: Lambda ZAP (pBlueScript SK-);
Site_1: EcoRI; Site_2: XhoI; Directionally cloned using
Stratagene's UniZap XR cDNA cloning kit with the 5' end
at the EcoRI site. The library represents 8 x 10e5
independent recombinant phage. The plants were greenhouse
grown."
BASE COUNT 84 a 95 c 70 g 88 t
ORIGIN



Query Match 48.4%; Score 205.6; DB 11; Length 337;
Best Local Similarity 94.2%; Pred. No. 2.6e-51;
Matches 226; Conservative 0; Mismatches 9; Indels 5; Gaps 1;

Qy     24 gagagagacgagatcatgaggaagcaatactcccctgtgctctacttctgcctgatggcc 83
          | ||||||||||||||||||||||||||||||||||||||||| ||| ||||||||||||
Db    331 GGGAGAGACGAGATCATGAGGAAGCAATACTCCCCTGTGCTCTCCTTGTGCCTGATGGCC 272

Qy     84 cttgtcgtagctgctatggtctgtgtcatgtacaccacctcggcacaagcaggaaggagt 143
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    271 CTTGTCGTAGCTGCTATGGTCTGTGTCATGTACACCACCTCGGCACAAGCAGGAAGGAGT 212

Qy    144 ggctacaactcgtacgaacctgatggaaggggtggatacaactctgttcccatcaacggc 203
          |||||||||||||||||||||||||||||| ||||||||||||||||||| || ||||||
Db    211 GGCTACAACTCGTACGAACCTGATGGAAGGAGTGGATACAACTCTGTTCCAATAAACGGC 152

Qy    204 ggtggcagcccctagctaggcggtggatccgagcctgtatcagaaatcgaaataatataa 263
          |||||||||||||||| ||||||||     |||||||||||||||||| ||| |||||||
Db    151 GGTGGCAGCCCCTAGCCAGGCGGTG-----GAGCCTGTATCAGAAATCAAAAAAATATAA 97



RESULT 5
BG550269
LOCUS BG550269 427 bp mRNA EST 05-APR-2001
DEFINITION 947039B02.y1 947 - 2 week shoot from Barkan lab Zea mays cDNA, mRNA
sequence.
ACCESSION BG550269
VERSION BG550269.1 GI:13558914
KEYWORDS EST.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 427)
AUTHORS Walbot,V.
TITLE Maize ESTs from various cDNA libraries sequenced at Stanford
University
JOURNAL Unpublished (1999)
COMMENT Contact: Walbot V
Department of Biological Sciences
Stanford University
855 California Ave, Palo Alto, CA 94304, USA
Tel: 650 723 2227
Fax: 650 725 8221
Email: walbot@stanford.edu
Plate: 947039 row: B column: 02.
FEATURES Location/Qualifiers
source 1..427
/organism="Zea mays"
/cultivar="B73"
/db_xref="taxon:4577"
/clone_lib="947 - 2 week shoot from Barkan lab"
/tissue_type="leaf and stem, including leaf base"
/dev_stage="2 week old seedling (3 leaves)"
/lab_host="XL1-Blue"
/note="Organ: shoot; Vector: Lambda ZAP (pBlueScript SK-);
Site_1: EcoRI; Site_2: XhoI; Directionally cloned using
Stratagene's UniZap XR cDNA cloning kit with the 5' end
at the EcoRI site. The library represents 8 x 10e5
independent recombinant phage. The plants were greenhouse
grown."
BASE COUNT 126 a 80 c 104 g 117 t
ORIGIN



Query Match 48.3%; Score 205.2; DB 11; Length 427;
Best Local Similarity 94.5%; Pred. No. 3.8e-51;
Matches 225; Conservative 0; Mismatches 8; Indels 5; Gaps 1;

Qy     26 gagagacgagatcatgaggaagcaatactcccctgtgctctacttctgcctgatggccct 85
          ||||||||||||||||||||||||||||||||||||||||| ||| ||||||||||||||
Db      2 GAGAGACGAGATCATGAGGAAGCAATACTCCCCTGTGCTCTCCTTGTGCCTGATGGCCCT 61

Qy     86 tgtcgtagctgctatggtctgtgtcatgtacaccacctcggcacaagcaggaaggagtgg 145
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     62 TGTCGTAGCTGCTATGGTCTGTGTCATGTACACCACCTCGGCACAAGCAGGAAGGAGTGG 121

Qy    146 ctacaactcgtacgaacctgatggaaggggtggatacaactctgttcccatcaacggcgg 205
          |||||||||||||||||||||||||||| ||||||||||||||||||| || ||||||||
Db    122 CTACAACTCGTACGAACCTGATGGAAGGAGTGGATACAACTCTGTTCCAATAAACGGCGG 181

Qy    206 tggcagcccctagctaggcggtggatccgagcctgtatcagaaatcgaaataatataa 263
          |||||||||||||| ||||||||     |||||||||||||||||| ||| |||||||
Db    182 TGGCAGCCCCTAGCCAGGCGGTG-----GAGCCTGTATCAGAAATCAAAAAAATATAA 234



RESULT 6
AW287785
LOCUS AW287785 473 bp mRNA EST 09-FEB-2000
DEFINITION 829008D03.x1 829 - Silk infected with Fusarium Zea mays cDNA, mRNA
sequence.
ACCESSION AW287785
VERSION AW287785.1 GI:6681798
KEYWORDS EST.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 473)
AUTHORS Walbot,V.
TITLE Maize ESTs from various cDNA libraries sequenced at Stanford
University
JOURNAL Unpublished (1999)
COMMENT Contact: Walbot V
Department of Biological Sciences
Stanford University
855 California Ave, Palo Alto, CA 94304, USA
Tel: 650 723 2227
Fax: 650 725 8221
Email: walbot@stanford.edu
Plate: 829008 row: D column: 03.
FEATURES Location/Qualifiers
source 1..473
/organism="Zea mays"
/cultivar="B73"
/db_xref="taxon:4577"
/clone_lib="829 - Silk infected with Fusarium"
/tissue_type="silk"
/dev_stage="adult"
/lab_host="DH10B"
/note="Organ: silk; Vector: pBluescript II XR; Site_1:
XhoI; Site_2: EcoRI; cDNA library of silks infected with 1
microliter of 500,000 spores/ml solution of Fusarium
graminearum DAOM 180378. Prepared by Sharon Allard of
Eastern Cereal and Oilseed Research Centre, Agriculture
and Agri-Food Canada using Stratagene cDNA synthesis kit.
Silk was harvested at 72 hours p.i."
BASE COUNT 148 a 81 c 143 g 101 t
ORIGIN



Query Match 47.2%; Score 200.8; DB 10; Length 473;
Best Local Similarity 92.9%; Pred. No. 8.3e-50;
Matches 223; Conservative 0; Mismatches 12; Indels 5; Gaps 1;

Qy     24 gagagagacgagatcatgaggaagcaatactcccctgtgctctacttctgcctgatggcc 83
          ||||||||    ||||||||||||||||||||||||||||||| ||| ||||||||||||
Db     81 GAGAGAGAGAGAATCATGAGGAAGCAATACTCCCCTGTGCTCTCCTTGTGCCTGATGGCC 140

Qy     84 cttgtcgtagctgctatggtctgtgtcatgtacaccacctcggcacaagcaggaaggagt 143
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    141 CTTGTCGTAGCTGCTATGGTCTGTGTCATGTACACCACCTCGGCACAAGCAGGAAGGAGT 200

Qy    144 ggctacaactcgtacgaacctgatggaaggggtggatacaactctgttcccatcaacggc 203
          |||||||||||||||||||||||||||||| ||||||||||||||||||| || ||||||
Db    201 GGCTACAACTCGTACGAACCTGATGGAAGGAGTGGATACAACTCTGTTCCAATAAACGGC 260

Qy    204 ggtggcagcccctagctaggcggtggatccgagcctgtatcagaaatcgaaataatataa 263
          |||||||||||||||| ||||||||     |||||||||||||||||| ||| |||||||
Db    261 GGTGGCAGCCCCTAGCCAGGCGGTG-----GAGCCTGTATCAGAAATCAAAAAAATATAA 315



RESULT 7
BG360891/c
LOCUS BG360891 296 bp mRNA EST 08-MAR-2001
DEFINITION 947043D12.x2 947 - 2 week shoot from Barkan lab Zea mays cDNA, mRNA
sequence.
ACCESSION BG360891
VERSION BG360891.1 GI:13249988
KEYWORDS EST.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 296)
AUTHORS Walbot,V.
TITLE Maize ESTs from various cDNA libraries sequenced at Stanford
University
JOURNAL Unpublished (1999)
COMMENT Contact: Walbot V
Department of Biological Sciences
Stanford University
855 California Ave, Palo Alto, CA 94304, USA
Tel: 650 723 2227
Fax: 650 725 8221
Email: walbot@stanford.edu
Plate: 947043 row: D column: 12.
FEATURES Location/Qualifiers
source 1..296
/organism="Zea mays"
/cultivar="B73"
/db_xref="taxon:4577"
/clone_lib="947 - 2 week shoot from Barkan lab"
/tissue_type="leaf and stem, including leaf base"
/dev_stage="2 week old seedling (3 leaves)"
/lab_host="XL1-Blue"
/note="Organ: shoot; Vector: Lambda ZAP (pBlueScript SK-);
Site_1: EcoRI; Site_2: XhoI; Directionally cloned using
Stratagene's UniZap XR cDNA cloning kit with the 5' end
at the EcoRI site. The library represents 8 x 10e5
independent recombinant phage. The plants were greenhouse
grown."
BASE COUNT 72 a 81 c 66 g 77 t
ORIGIN



Query Match 47.1%; Score 200; DB 11; Length 296;
Best Local Similarity 93.6%; Pred. No. 1.2e-49;
Matches 221; Conservative 0; Mismatches 10; Indels 5; Gaps 1;

Qy     28 gagacgagatcatgaggaagcaatactcccctgtgctctacttctgcctgatggcccttg 87
          ||||||||||||||||||||||||||||||||||||||| ||| ||||||||||||||||
Db    292 GAGACGAGATCATGAGGAAGCAATACTCCCCTGTGCTCTCCTTGTGCCTGATGGCCCTTG 233

Qy     88 tcgtagctgctatggtctgtgtcatgtacaccacctcggcacaagcaggaaggagtggct 147
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    232 TCGTAGCTGCTATGGTCTGTGTCATGTACACCACCTCGGCACAAGCAGGAAGGAGTGGCT 173

Qy    148 acaactcgtacgaacctgatggaaggggtggatacaactctgttcccatcaacggcggtg 207
          |||||||||||||||||||||||||| ||||||||||||||||| | || | ||||||||
Db    172 ACAACTCGTACGAACCTGATGGAAGGAGTGGATACAACTCTGTTACAATAACCGGCGGTG 113

Qy    208 gcagcccctagctaggcggtggatccgagcctgtatcagaaatcgaaataatataa 263
          |||||||||||| ||||||||     |||||||||||||||||| ||| |||||||
Db    112 GCAGCCCCTAGCCAGGCGGTG-----GAGCCTGTATCAGAAATCAAAAAAATATAA 62



RESULT 8
BG840972/c
LOCUS BG840972 372 bp mRNA EST 29-MAY-2001
DEFINITION MEST14-B02.T3 ISUM4-TN Zea mays cDNA clone MEST14-B02 3', mRNA
sequence.
ACCESSION BG840972
VERSION BG840972.2 GI:14243334
KEYWORDS EST.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 372)
AUTHORS Qiu,F., Cui,F., Guo,L., Ashlock,D.A, Wen,T.J. and Schnable,P.S.
TITLE Expressed Sequence Tags from B73 Maize Seedlings and Silks
JOURNAL Unpublished (2001)
COMMENT On May 25, 2001 this sequence version replaced gi:14207294.
Contact: Patrick S. Schnable
Schnable Laboratory
Iowa State University
G405 Agronomy, Iowa State University, Ames, IA 50011-1010, USA
Tel: 515-294-0975
Fax: 515-294-2299
Email: schnable@iastate.edu
PCR PRimers
FORWARD: T7-1 (AA TAC GAC TCA CTA TAG)
BACKWARD: T3 (ATT AAC CCT CAC TAA AG)
Seq primer: primer T3 (ATT AAC CCT CAC TAA AG).
FEATURES Location/Qualifiers
source 1..372
/organism="Zea mays"
/cultivar="B73"
/db_xref="taxon:4577"
/clone="MEST14-B02"
/clone_lib="ISUM4-TN"
/tissue_type="Seedling and silk"
/lab_host="DH10B"
/note="Vector: pT7T3PAC; Site_1: EcoRI; Site_2: NotI;
ds-cDNA molecules were generated as follows. First-strand
cDNA was prepared from oligo-dT selected mRNA by priming
with a NotI oligo-dT primer (5'
AACTGGAAGAATTCGCGGCCGCAGGAATTTTTTTTTTTTTTTTTT). The
resulting DNA:RNA hybrid was treated with RNase H and used
as a template for DNA PolI-catalyzed second strand
synthesis. After the addition of EcoRI adaptors, the
ds-cDNAs were digested with NotI and size-selected. The
resulting molecules were directionally cloned into the
EcoRI and NotI sites of the pT7T3PAC vector. The library
then went through one round of normalization to CoT value
of 5 based on the methods of Marcelo Bento Soares (Genome
Research 6: 791-806, 1996)."
BASE COUNT 93 a 94 c 75 g 110 t
ORIGIN



Query Match 46.4%; Score 197.2; DB 11; Length 372; 

Best Local Similarity 94.3%; Pred. No. 9.2e-49; 

Matches 217; Conservative 0; Mismatches 8; Indels 5; Gaps 1; 

Qy     34 agatcatgaggaagcaatactcccctgtgctctacttctgcctgatggcccttgtcgtag 93
          ||||||||||||||||||||||||||||||||| ||| ||||||||||||||||||||||
Db    372 AGATCATGAGGAAGCAATACTCCCCTGTGCTCTCCTTGTGCCTGATGGCCCTTGTCGTAG 313

Qy     94 ctgctatggtctgtgtcatgtacaccacctcggcacaagcaggaaggagtggctacaact 153
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    312 CTGCTATGGTCTGTGTCATGTACACCACCTCGGCACAAGCAGGAAGGAGTGGCTACAACT 253

Qy    154 cgtacgaacctgatggaaggggtggatacaactctgttcccatcaacggcggtggcagcc 213
          |||||||||||||||||||| ||||||||||||||||||| || ||||||||||||||||
Db    252 CGTACGAACCTGATGGAAGGAGTGGATACAACTCTGTTCCAATAAACGGCGGTGGCAGCC 193

Qy    214 cctagctaggcggtggatccgagcctgtatcagaaatcgaaataatataa 263
          |||||| ||||||||     |||||||||||||||||| ||| |||||||
Db    192 CCTAGCCAGGCGGTG-----GAGCCTGTATCAGAAATCAAAAAAATATAA 148



RESULT 9
BG842137/c
LOCUS BG842137 410 bp mRNA EST 29-MAY-2001
DEFINITION MEST36-F07.T3 ISUM3-TL Zea mays cDNA clone MEST36-F07 3', mRNA
sequence.
ACCESSION BG842137
VERSION BG842137.1 GI:14208459
KEYWORDS EST.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 410)
AUTHORS Qiu,F., Cui,F., Guo,L., Ashlock,D.A, Wen,T.J. and Schnable,P.S.
TITLE Expressed Sequence Tags from B73 Maize Seedlings and Silks
JOURNAL Unpublished (2001)
COMMENT Contact: Patrick S. Schnable
Schnable Laboratory
Iowa State University
G405 Agronomy, Iowa State University, Ames, IA 50011-1010, USA
Tel: 515-294-0975
Fax: 515-294-2299
Email: schnable@iastate.edu
PCR PRimers
FORWARD: T7-1 (AA TAC GAC TCA CTA TAG)
BACKWARD: T3 (ATT AAC CCT CAC TAA AG)
Seq primer: primer T3 (ATT AAC CCT CAC TAA AG).
FEATURES Location/Qualifiers
source 1..410
/organism="Zea mays"
/cultivar="B73"
/db_xref="taxon:4577"
/clone="MEST36-F07"
/clone_lib="ISUM3-TL"
/tissue_type="Seedling and silk"
/lab_host="DH10B"
/note="Vector: pT7T3PAC; Site_1: EcoRI; Site_2: NotI;
ds-cDNA molecules were generated as follows. First-strand
cDNA was prepared from oligo-dT selected mRNA by priming
with a NotI oligo-dT primer (5'
AACTGGAAGAATTCGCGGCCGCAGGAATTTTTTTTTTTTTTTTTT). The
resulting DNA:RNA hybrid was treated with RNase H and used
as a template for DNA PolI-catalyzed second strand
synthesis. After the addition of EcoRI adaptors, the
ds-cDNAs were digested with NotI and size-selected. The
resulting molecules were directionally cloned into the
EcoRI and NotI sites of the pT7T3PAC vector."
BASE COUNT 114 a 99 c 79 g 118 t
ORIGIN



Query Match 46.4%; Score 197.2; DB 11; Length 410;
Best Local Similarity 94.3%; Pred. No. 9.5e-49;
Matches 217; Conservative 0; Mismatches 8; Indels 5; Gaps 1;

Qy     34 agatcatgaggaagcaatactcccctgtgctctacttctgcctgatggcccttgtcgtag 93
          ||||||||||||||||||||||||||||||||| ||| ||||||||||||||||||||||
Db    410 AGATCATGAGGAAGCAATACTCCCCTGTGCTCTCCTTGTGCCTGATGGCCCTTGTCGTAG 351

Qy     94 ctgctatggtctgtgtcatgtacaccacctcggcacaagcaggaaggagtggctacaact 153
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    350 CTGCTATGGTCTGTGTCATGTACACCACCTCGGCACAAGCAGGAAGGAGTGGCTACAACT 291

Qy    154 cgtacgaacctgatggaaggggtggatacaactctgttcccatcaacggcggtggcagcc 213
          |||||||||||||||||||| ||||||||||||||||||| || ||||||||||||||||
Db    290 CGTACGAACCTGATGGAAGGAGTGGATACAACTCTGTTCCAATAAACGGCGGTGGCAGCC 231

Qy    214 cctagctaggcggtggatccgagcctgtatcagaaatcgaaataatataa 263
          |||||| ||||||||     |||||||||||||||||| ||| |||||||
Db    230 CCTAGCCAGGCGGTG-----GAGCCTGTATCAGAAATCAAAAAAATATAA 186



RESULT 10
BG874098/c
LOCUS BG874098 379 bp mRNA EST 29-MAY-2001
DEFINITION MEST46-C08.T3 ISUM4-TN Zea mays cDNA clone MEST46-C08 3', mRNA
sequence.
ACCESSION BG874098
VERSION BG874098.1 GI:14245516
KEYWORDS EST.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 379)
AUTHORS Qiu,F., Cui,F., Guo,L., Ashlock,D.A, Wen,T.J. and Schnable,P.S.
TITLE Expressed Sequence Tags from B73 Maize Seedlings and Silks
JOURNAL Unpublished (2001)
COMMENT Contact: Patrick S. Schnable
Schnable Laboratory
Iowa State University
G405 Agronomy, Iowa State University, Ames, IA 50011-1010, USA
Tel: 515-294-0975
Fax: 515-294-2299
Email: schnable@iastate.edu
PCR PRimers
FORWARD: T7-1 (AA TAC GAC TCA CTA TAG)
BACKWARD: T3 (ATT AAC CCT CAC TAA AG)
Seq primer: primer T3 (ATT AAC CCT CAC TAA AG).
FEATURES Location/Qualifiers
source 1..379
/organism="Zea mays"
/cultivar="B73"
/db_xref="taxon:4577"
/clone="MEST46-C08"
/clone_lib="ISUM4-TN"
/tissue_type="Seedling and silk"
/lab_host="DH10B"
/note="Vector: pT7T3PAC; Site_1: EcoRI; Site_2: NotI;
ds-cDNA molecules were generated as follows. First-strand
cDNA was prepared from oligo-dT selected mRNA by priming
with a NotI oligo-dT primer (5'
AACTGGAAGAATTCGCGGCCGCAGGAATTTTTTTTTTTTTTTTTT). The
resulting DNA:RNA hybrid was treated with RNase H and used
as a template for DNA PolI-catalyzed second strand
synthesis. After the addition of EcoRI adaptors, the
ds-cDNAs were digested with NotI and size-selected. The
resulting molecules were directionally cloned into the
EcoRI and NotI sites of the pT7T3PAC vector. The library
then went through one round of normalization to CoT value
of 5 based on the methods of Marcelo Bento Soares (Genome
Research 6: 791-806, 1996)."
BASE COUNT 90 a 94 c 74 g 121 t
ORIGIN



Query Match 46.4%; Score 197; DB 11; Length 379;
Best Local Similarity 93.6%; Pred. No. 1.1e-48;
Matches 218; Conservative 0; Mismatches 10; Indels 5; Gaps 1;

Qy     31 acgagatcatgaggaagcaatactcccctgtgctctacttctgcctgatggcccttgtcg 90
          |||||||||||||||||||||||||||||||||||| ||| |||||||||||||||||||
Db    379 ACGAGATCATGAGGAAGCAATACTCCCCTGTGCTCTCCTTGTGCCTGATGGCCCTTGTCG 320

Qy     91 tagctgctatggtctgtgtcatgtacaccacctcggcacaagcaggaaggagtggctaca 150
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    319 TAGCTGCTATGGTCTGTGTCATGTACACCACCTCGGCACAAGCAGGAAGGAGTGGCTACA 260

Qy    151 actcgtacgaacctgatggaaggggtggatacaactctgttcccatcaacggcggtggca 210
          ||||||||||||||||||||||| |||||||||||| |||||| || |||||||||||||
Db    259 ACTCGTACGAACCTGATGGAAGGAGTGGATACAACTTTGTTCCAATAAACGGCGGTGGCA 200

Qy    211 gcccctagctaggcggtggatccgagcctgtatcagaaatcgaaataatataa 263
          ||||| ||| ||||||||     |||||||||||||||||| ||| |||||||
Db    199 GCCCCAAGCCAGGCGGTG-----GAGCCTGTATCAGAAATCAAAAAAATATAA 152





RESULT 11
BG840656
LOCUS BG840656 371 bp mRNA EST 29-MAY-2001
DEFINITION MEST14-B02.T7-1 ISUM4-TN Zea mays cDNA clone MEST14-B02 5', mRNA
sequence.
ACCESSION BG840656
VERSION BG840656.2 GI:14242839
KEYWORDS EST.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 371)
AUTHORS Qiu,F., Cui,F., Guo,L., Ashlock,D.A, Wen,T.J. and Schnable,P.S.
TITLE Expressed Sequence Tags from B73 Maize Seedlings and Silks
JOURNAL Unpublished (2001)
COMMENT On May 25, 2001 this sequence version replaced gi:14206978.
Contact: Patrick S. Schnable
Schnable Laboratory
Iowa State University
G405 Agronomy, Iowa State University, Ames, IA 50011-1010, USA
Tel: 515-294-0975
Fax: 515-294-2299
Email: schnable@iastate.edu
PCR PRimers
FORWARD: T7-1 (AA TAC GAC TCA CTA TAG)
BACKWARD: T3 (ATT AAC CCT CAC TAA AG)
Seq primer: primer T7-1 (AA TAC GAC TCA CTA TAG).
FEATURES Location/Qualifiers
source 1..371
/organism="Zea mays"
/cultivar="B73"
/db_xref="taxon:4577"
/clone="MEST14-B02"
/clone_lib="ISUM4-TN"
/tissue_type="Seedling and silk"
/lab_host="DH10B"
/note="Vector: pT7T3PAC; Site_1: EcoRI; Site_2: NotI;
ds-cDNA molecules were generated as follows. First-strand
cDNA was prepared from oligo-dT selected mRNA by priming
with a NotI oligo-dT primer (5'
AACTGGAAGAATTCGCGGCCGCAGGAATTTTTTTTTTTTTTTTTT). The
resulting DNA:RNA hybrid was treated with RNase H and used
as a template for DNA PolI-catalyzed second strand
synthesis. After the addition of EcoRI adaptors, the
ds-cDNAs were digested with NotI and size-selected. The
resulting molecules were directionally cloned into the
EcoRI and NotI sites of the pT7T3PAC vector. The library
then went through one round of normalization to CoT value
of 5 based on the methods of Marcelo Bento Soares (Genome
Research 6: 791-806, 1996)."
BASE COUNT 110 a 75 c 93 g 93 t
ORIGIN



Query Match 44.0%; Score 187.2; DB 11; Length 371; 

Best Local Similarity 94.1%; Pred. No. 9.4e-46; 

Matches 207; Conservative 0; Mismatches 8; Indels 5; Gaps 1; 

Qy     44 gaagcaatactcccctgtgctctacttctgcctgatggcccttgtcgtagctgctatggt 103
          ||||||||||||||||||||||| ||| ||||||||||||||||||||||||||||||||
Db     10 GAAGCAATACTCCCCTGTGCTCTCCTTGTGCCTGATGGCCCTTGTCGTAGCTGCTATGGT 69

Qy    104 ctgtgtcatgtacaccacctcggcacaagcaggaaggagtggctacaactcgtacgaacc 163
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     70 CTGTGTCATGTACACCACCTCGGCACAAGCAGGAAGGAGTGGCTACAACTCGTACGAACC 129

Qy    164 tgatggaaggggtggatacaactctgttcccatcaacggcggtggcagcccctagctagg 223
          |||||||||| ||||||||||||||||||| || |||||||||||||||||||||| |||
Db    130 TGATGGAAGGAGTGGATACAACTCTGTTCCAATAAACGGCGGTGGCAGCCCCTAGCCAGG 189

Qy    224 cggtggatccgagcctgtatcagaaatcgaaataatataa 263
          |||||     |||||||||||||||||| ||| |||||||
Db    190 CGGTG-----GAGCCTGTATCAGAAATCAAAAAAATATAA 224



RESULT 12
BG349675
LOCUS BG349675 208 bp mRNA EST 01-MAR-2001
DEFINITION 947031G10.y1 947 - 2 week shoot from Barkan lab Zea mays cDNA, mRNA
sequence.
ACCESSION BG349675
VERSION BG349675.1 GI:13178402
KEYWORDS EST.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 208)
AUTHORS Walbot,V.
TITLE Maize ESTs from various cDNA libraries sequenced at Stanford
University
JOURNAL Unpublished (1999)
COMMENT Contact: Walbot V
Department of Biological Sciences
Stanford University
855 California Ave, Palo Alto, CA 94304, USA
Tel: 650 723 2227
Fax: 650 725 8221
Email: walbot@stanford.edu
Plate: 947031 row: G column: 10.
FEATURES Location/Qualifiers
source 1..208
/organism="Zea mays"
/cultivar="B73"
/db_xref="taxon:4577"
/clone_lib="947 - 2 week shoot from Barkan lab"
/tissue_type="leaf and stem, including leaf base"
/dev_stage="2 week old seedling (3 leaves)"
/lab_host="XL1-Blue"
/note="Organ: shoot; Vector: Lambda ZAP (pBlueScript SK-);
Site_1: EcoRI; Site_2: XhoI; Directionally cloned using
Stratagene's UniZap XR cDNA cloning kit with the 5' end
at the EcoRI site. The library represents 8 x 10e5
independent recombinant phage. The plants were greenhouse
grown."
BASE COUNT 49 a 55 c 58 g 46 t
ORIGIN



Query Match 43.7%; Score 185.8; DB 11; Length 208; 

Best Local Similarity 96.4%; Pred. No. 2e-45; 

Matches 190; Conservative 0; Mismatches 7; Indels 0; Gaps 0; 

Qy     37 tcatgaggaagcaatactcccctgtgctctacttctgcctgatggcccttgtcgtagctg 96
          |||||||||||||||||||||||||||||| ||| |||||||||||||||||||||||||
Db      1 TCATGAGGAAGCAATACTCCCCTGTGCTCTCCTTGTGCCTGATGGCCCTTGTCGTAGCTG 60

Qy     97 ctatggtctgtgtcatgtacaccacctcggcacaagcaggaaggagtggctacaactcgt 156
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 CTATGGTCTGTGTCATGTACACCACCTCGGCACAAGCAGGAAGGAGTGGCTACAACTCGT 120

Qy    157 acgaacctgatggaaggggtggatacaactctgttcccatcaacggcggtggcagcccct 216
          ||||||||||||||||| ||||||||||||||||||| || |||||||||||||||||||
Db    121 ACGAACCTGATGGAAGGAGTGGATACAACTCTGTTCCAATAAACGGCGGTGGCAGCCCCT 180

Qy    217 agctaggcggtggatcc 233
          ||| |||||||||| ||
Db    181 AGCCAGGCGGTGGAGCC 197



RESULT 13
BG349676
LOCUS BG349676 351 bp mRNA EST 01-MAR-2001
DEFINITION 947031G10.y2 947 - 2 week shoot from Barkan lab Zea mays cDNA, mRNA
sequence.
ACCESSION BG349676
VERSION BG349676. GI:13178403
KEYWORDS EST.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 351)
AUTHORS Walbot,V.
TITLE Maize ESTs from various cDNA libraries sequenced at Stanford
University
JOURNAL Unpublished (1999)
COMMENT Contact: Walbot V
Department of Biological Sciences
Stanford University
855 California Ave, Palo Alto, CA 94304, USA
Tel: 650 723 2227
Fax: 650 725 8221
Email: walbot@stanford.edu
Plate: 947031 row: G column: 10.
FEATURES Location/Qualifiers
source 1..351
/organism="Zea mays"
/cultivar="B73"
/db_xref="taxon:4577"
/clone_lib="947 - 2 week shoot from Barkan lab"
/tissue_type="leaf and stem, including leaf base"
/dev_stage="2 week old seedling (3 leaves)"
/lab_host="XL1-Blue"
/note="Organ: shoot; Vector: Lambda ZAP (pBlueScript SK-);
Site_1: EcoRI; Site_2: XhoI; Directionally cloned using
Stratagene's UniZap XR cDNA cloning kit with the 5' end
at the EcoRI site. The library represents 8 x 10e5
independent recombinant phage. The plants were greenhouse
grown."
BASE COUNT 97 a 73 c 92 g 89 t
ORIGIN



Query Match 42.9%; Score 182.2; DB 11; Length 351; 

Best Local Similarity 93.8%; Pred. No. 3e-44; 

Matches 213; Conservative 0; Mismatches 8; Indels 6; Gaps 2; 

Qy  38 catgaggaagcaatactcccctgtgctctacttctgcctgatggcccttgtcgtagctgc 97
       ||||||||||||||||||||||||||||| ||| ||||||||||||||||||||||||||
Db   1 CATGAGGAAGCAATACTCCCCTGTGCTCTCCTTGTGCCTGATGGCCCTTGTCGTAGCTGC 60

Qy  98 tatggtctgtgtcatgtacaccacctcggcacaagca-ggaaggagtggctacaactcgt 156
       ||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||
Db  61 TATGGTCTGTGTCATGTACACCACCTCGGCACAAGCACGGAAGGAGTGGCTACAACTCGT 120

Qy 157 acgaacctgatggaaggggtggatacaactctgttcccatcaacggcggtggcagcccct 216
       ||||||||||||||||| ||||||||||||||||||| || |||||||||||||||||||
Db 121 ACGAACCTGATGGAAGGAGTGGATACAACTCTGTTCCAATAAACGGCGGTGGCAGCCCCT 180

Qy 217 agctaggcggtggatccgagcctgtatcagaaatcgaaataatataa 263
       ||| ||||||||     |||||||||||||||||| ||| |||||||
Db 181 AGCCAGGCGGTG-----GAGCCTGTATCAGAAATCAAAAAAATATAA 222



RESULT 14 
AA072465 

LOCUS       AA072465 216 bp mRNA EST 02-OCT-1996
DEFINITION  zEST00696 Maize Leaf, Stratagene #937005 Zea mays cDNA clone
            csuh00696 5' end, mRNA sequence.
ACCESSION   AA072465
VERSION     AA072465. GI:1590803
KEYWORDS    EST.
SOURCE      Zea mays.
  ORGANISM  Zea mays
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Zea.
REFERENCE   1 (bases 1 to 216)
  AUTHORS   Baysdorfer,C.
  TITLE     The Maize cDNA Program
  JOURNAL   Unpublished (1993)
COMMENT     Contact: Baysdorfer C
            California State University
            Dept Biol Sci, California State Univ, Hayward, CA 94542
            Tel: 5108853459
            Fax: 5108854747
            Email: cbaysdor@haywire.csuhayward.edu
            Seq primer: SK.
FEATURES             Location/Qualifiers
     source          1..216
                     /organism="Zea mays"
                     /strain="B73"
                     /db_xref="taxon:4577"
                     /clone="csuh00696"
                     /clone_lib="Maize Leaf, Stratagene #937005"
                     /note="Vector: Uni-ZAP; Site_1: EcoRI; Site_2: XhoI; mRNA
                     isolated from illuminated leaves and sheaths of 5 week old
                     plant. cDNA directionally cloned into vector."
BASE COUNT  57 a 54 c 56 g 47 t 2 others
ORIGIN



Query Match 42.4%; Score 180; DB 10; Length 216; 

Best Local Similarity 89.7%; Pred. No. 1.1e-43;

Matches 192; Conservative 0; Mismatches 22; Indels 0; Gaps 0; 

Qy  36 atcatgaggaagcaatactcccctgtgctctacttctgcctgatggcccttgtcgtagct 95
       ||||||||||||||||||||||||||||||| ||| |||||||||  |||||||||||||
Db   1 ATCATGAGGAAGCAATACTCCCCTGTGCTCTCCTTGTGCCTGATGNNCCTTGTCGTAGCT 60

Qy  96 gctatggtctgtgtcatgtacaccacctcggcacaagcaggaaggagtggctacaactcg 155
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 GCTATGGTCTGTGTCATGTACACCACCTCGGCACAAGCAGGAAGGAGTGGCTACAACTCG 120

Qy 156 tacgaacctgatggaaggggtggatacaactctgttcccatcaacggcggtggcagcccc 215
       |||||||||||||||||| ||||||||||||||||||| || ||||||||||||||||||
Db 121 TACGAACCTGATGGAAGGAGTGGATACAACTCTGTTCCAATAAACGGCGGTGGCAGCCCC 180

Qy 216 tagctaggcggtggatccgagcctgtatcagaaa 249
       |||| |||||  |      | |    |||| |||
Db 181 TAGCCAGGCGTGGAGCTGTATCAGAAATCAAAAA 214



RESULT 15 
BG355157 



LOCUS       BG355157 256 bp mRNA EST 06-MAR-2001
DEFINITION  947043D12.y1 947 - 2 week shoot from Barkan lab Zea mays cDNA, mRNA
            sequence.
ACCESSION   BG355157
VERSION     BG355157.1 GI:13237143
KEYWORDS    EST.
SOURCE      Zea mays.
  ORGANISM  Zea mays
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Zea.
REFERENCE   1 (bases 1 to 256)
  AUTHORS   Walbot,V.
  TITLE     Maize ESTs from various cDNA libraries sequenced at Stanford
            University
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Walbot V
            Department of Biological Sciences
            Stanford University
            855 California Ave, Palo Alto, CA 94304, USA
            Tel: 650 723 2227
            Fax: 650 725 8221
            Email: walbot@stanford.edu
            Plate: 947043 row: D column: 12.
FEATURES             Location/Qualifiers
     source          1..256
                     /organism="Zea mays"
                     /cultivar="B73"
                     /db_xref="taxon:4577"
                     /clone_lib="947 - 2 week shoot from Barkan lab"
                     /tissue_type="leaf and stem, including leaf base"
                     /dev_stage="2 week old seedling (3 leaves)"
                     /lab_host="XL1-Blue"
                     /note="Organ: shoot; Vector: Lambda ZAP (pBlueScript SK-);
                     Site_1: EcoRI; Site_2: XhoI; Directionally cloned using
                     Stratagene's UniZap XR cDNA cloning kit with the 5' end
                     at the EcoRI site. The library represents 8 x 10e5
                     independent recombinant phage. The plants were greenhouse
                     grown."
BASE COUNT  70 a 49 c 71 g 66 t
ORIGIN



Query Match 33.5%; Score 142.4; DB 11; Length 256; 

Best Local Similarity 93.6%; Pred. No. 2.5e-32; 

Matches 161; Conservative 0; Mismatches 6; Indels 5; Gaps 1; 

Qy  92 agctgctatggtctgtgtcatgtacaccacctcggcacaagcaggaaggagtggctacaa 151
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2 AGCTGCTATGGTCTGTGTCATGTACACCACCTCGGCACAAGCAGGAAGGAGTGGCTACAA 61

Qy 152 ctcgtacgaacctgatggaaggggtggatacaactctgttcccatcaacggcggtggcag 211
       |||||||||||||||||||||| ||||||||||||||||||| || ||||||||||||||
Db  62 CTCGTACGAACCTGATGGAAGGAGTGGATACAACTCTGTTCCAATAAACGGCGGTGGCAG 121

Qy 212 cccctagctaggcggtggatccgagcctgtatcagaaatcgaaataatataa 263
       |||||||| ||||||||     |||||||||||||||||| ||| |||||||
Db 122 CCCCTAGCCAGGCGGTG-----GAGCCTGTATCAGAAATCAAAAAAATATAA 168



Search completed: February 7, 2002, 08:20:48 
Job time: 18125 sec



